An analysis of how search engines determine whether a site is cheating (2)

Hypothesis 1: the closer a page is to a trusted page, the more it deserves to be trusted. Here, distance means the number of link hops needed to get from the trusted page to that page.
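To make the notion of "distance" in hypothesis 1 concrete, here is a minimal sketch that counts link hops with a breadth-first search. The function name `hop_distances` and the toy graph are made up for illustration; they are not part of TrustRank itself.

```python
from collections import deque

def hop_distances(out_links, trusted):
    """Fewest link hops from any trusted page to each reachable page (BFS)."""
    dist = {page: 0 for page in trusted}
    queue = deque(trusted)
    while queue:
        page = queue.popleft()
        for target in out_links.get(page, []):
            if target not in dist:  # in BFS the first visit is the shortest path
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

# Hypothetical graph: "seed" is the only trusted page.
graph = {"seed": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"]}
distances = hop_distances(graph, ["seed"])
```

Pages that cannot be reached from any trusted page simply do not appear in the result, which matches the idea that they receive no trust at all.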

Trust splitting: the trust score a page has obtained is divided evenly among its outbound links. If a page has K outbound links, each link is assigned 1/K of the trust score, and that share is passed on to the page the link points to.
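The even split described above can be sketched in a few lines. The helper name `split_trust` is an assumption for this example, and returning nothing for a page without outbound links is just one reasonable choice.

```python
def split_trust(score, out_links):
    """Give each of a page's K outbound links an equal 1/K share of its trust."""
    k = len(out_links)
    if k == 0:
        return {}  # no outbound links, so there is nothing to pass on
    return {target: score / k for target in out_links}

# A page with trust 1.0 and four outbound links passes 0.25 to each target.
shares = split_trust(1.0, ["a", "b", "c", "d"])
```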

The TrustRank algorithm relies on manual review to decide whether a web page should be put into the trusted set. Because manual review is a lot of work, two strategies are proposed for pre-selecting candidate trusted pages, and manual review is then carried out only on that pre-selected set.

Step one: determine the set of trusted web pages (the whitelist).

That is as far as the analysis goes for now. Part (3) of this series will explain the BadRank algorithm; see my blog for the details.

By combining the two propagation strategies above, trust scores can be spread between the page nodes of the link graph. In the final result, any page whose trust score falls below a certain threshold is considered a cheating page.
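The following is a rough sketch of how attenuation and splitting might be combined and then thresholded, written in the spirit of a seed-biased PageRank iteration. The damping value `beta`, the number of rounds, the threshold, and the toy graph are all assumptions for illustration, not values given in this article.

```python
def propagate_trust(out_links, seeds, beta=0.85, rounds=20):
    """Spread trust from seed pages using per-hop damping and 1/K splitting."""
    pages = set(out_links) | {t for targets in out_links.values() for t in targets}
    base = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(base)
    for _ in range(rounds):
        # Every round re-injects the seed trust, then adds the damped shares.
        nxt = {p: (1 - beta) * base[p] for p in pages}
        for page, targets in out_links.items():
            if not targets:
                continue
            share = beta * trust[page] / len(targets)  # damping times 1/K split
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust

def flag_spam(trust, threshold):
    """Pages whose trust score is below the (assumed) threshold."""
    return sorted(p for p, score in trust.items() if score < threshold)

# Hypothetical graph: two pages link only to each other and are never
# reached from the trusted seed page.
graph = {"seed": ["a", "b"], "a": ["c"], "spam1": ["spam2"], "spam2": ["spam1"]}
trust = propagate_trust(graph, {"seed"})
suspects = flag_spam(trust, threshold=0.01)
```

In this toy graph the two isolated pages end with a trust score of zero and are the only ones flagged, while "c", two hops from the seed, keeps a smaller score than "a".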

This article continues from "An analysis of how search engines determine whether a site is cheating (1)".

TrustRank algorithm

Trust attenuation: the farther a page is from the trusted pages, the smaller the trust score it receives through propagation.
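One simple way to picture attenuation is a fixed decay factor applied once per link hop. The article does not say how fast trust decays, so the geometric form and the factor `beta` below are assumptions made for this sketch.

```python
def dampened_trust(seed_score, hops, beta=0.85):
    """Trust reaching a page `hops` links away, shrinking by `beta` per hop."""
    return seed_score * beta ** hops

# With an assumed beta of 0.5, trust halves with every extra hop.
decay = [dampened_trust(1.0, h, beta=0.5) for h in range(4)]
```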

We first introduce the TrustRank algorithm.

The TrustRank algorithm belongs to the trust propagation model and basically follows that model's process. The algorithm consists of the following two steps.

Guangzhou SEO Chen Yong continues the analysis with the trust propagation model. The three representative algorithms of the trust propagation and anomaly detection models are the TrustRank algorithm, the BadRank algorithm, and the SpamRank algorithm.

* Pre-selection strategy 2: inverse PR (Inverse PageRank). Ordinary PR calculates a page's weight from its inbound links. Inverse PR does the opposite: it calculates the weight from the page's outbound links, that is, with the direction of the links between pages reversed. The subset of pages with the highest scores is then selected as the pre-selected page set.
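As a sketch of the idea, inverse PR can be computed by flipping every link and then running an ordinary PageRank on the flipped graph. The plain power iteration, the damping value, and the toy graph below are assumptions for illustration; every page must appear as a key in the graph.

```python
def reverse_graph(out_links):
    """Flip every link so that A -> B becomes B -> A."""
    reversed_links = {page: [] for page in out_links}
    for page, targets in out_links.items():
        for target in targets:
            reversed_links.setdefault(target, []).append(page)
    return reversed_links

def pagerank(out_links, damping=0.85, rounds=50):
    """Plain power-iteration PageRank; dangling pages spread their score evenly."""
    pages = list(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(rounds):
        dangling = sum(rank[p] for p in pages if not out_links[p])
        nxt = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page in pages:
            for target in out_links[page]:
                nxt[target] += damping * rank[page] / len(out_links[page])
        rank = nxt
    return rank

# Hypothetical graph: "hub" links out to many pages but nothing links to it.
graph = {"hub": ["a", "b", "c"], "a": ["c"], "b": [], "c": []}
inverse_pr = pagerank(reverse_graph(graph))
best_candidate = max(inverse_pr, key=inverse_pr.get)
```

Under ordinary PR the "hub" page would score poorly because nothing points to it; on the reversed graph it scores highest, which is why this strategy favors pages from which many other pages can be reached.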

In this step, the way TrustRank propagates trust is based on the following two hypotheses.

Trust scores spread outward from the whitelist pages to other pages.

* Pre-selection strategy 1: high PR score. Pages with a higher PR score are assumed to be more reliable, so after the PR value of every page has been calculated, a small number of the highest-scoring pages are extracted as the pre-selected page set.
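Once PR scores are available, this pre-selection is just a top-N cut. The sketch below assumes the scores are already computed; the function name, the host names, and the score values are invented for the example.

```python
def pick_seed_candidates(pr_scores, count):
    """Return the `count` pages with the highest PR score, best first."""
    ranked = sorted(pr_scores, key=pr_scores.get, reverse=True)
    return ranked[:count]

# Hypothetical PR scores; real values would come from a PageRank computation.
pr_scores = {"news.example": 0.31, "blog.example": 0.07,
             "wiki.example": 0.42, "shop.example": 0.20}
candidates = pick_seed_candidates(pr_scores, 2)
```

The resulting candidates would still go through manual review before being accepted into the trusted set.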

Hypothesis 2: the fewer outbound links a high-quality page contains, the more likely it is that the pages it points to are also high quality; the more outbound links it contains, the smaller that likelihood.

Step two: propagate the trust scores from the whitelist pages to other pages in a certain way.

