Combating Web Spam with TrustRank
Source: http://ilpubs.stanford.edu:8090/638/1/2004-17.pdf
In the course, we extensively discussed PageRank and how this radical idea fueled Google's rise to become an internet giant. But the core assumption PageRank relies on, that a link to a page is an endorsement of that page, also makes it easy to exploit. (This is closely related to, but not the same as, the hubs and authorities model.) For instance, many site owners, including legitimate organizations, use black-hat SEO: they scatter hidden anchor tags all over their sites to fool the algorithm and get a better ranking. PageRank is also vulnerable to link spam posted by users, and to networks of pages that promote fake news. The vulnerability comes from the fact that PageRank counts links without judging whether the pages casting those "votes" are good or bad.
To combat this, the paper linked above (by Gyöngyi and Garcia-Molina of Stanford and Pedersen of Yahoo!) proposes an approach called TrustRank, which adds the notion of "trust" to a PageRank-style computation. Instead of relying only on how many links point to a page, it also accounts for how trustworthy the linking pages are. The algorithm starts from a small seed set of pages that human experts have reviewed and labeled as good. Trust then flows outward from those seeds along their links, weakening with each hop, on the premise that good pages rarely link to spam. Pages that trusted pages link to, directly or through a short chain, end up with high scores, while spam pages, which are seldom linked from reputable sites, end up with very low ones. The same idea could, in principle, make it harder for cascades of fake news to build up, while also penalizing people who rely on dirty tricks for SEO.
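To make the propagation step concrete, here is a minimal sketch of the trust computation as a biased PageRank iteration: at each step a page passes a damped share of its trust to the pages it links to, and the remaining mass is sent back to the trusted seeds only. The function name `trustrank`, the toy link graph, and the page names are made up for illustration. The damping factor of 0.85 and the 20 iterations are the values I recall the paper using in its experiments, so treat them as assumptions. The seed selection step (choosing which pages to have humans review) is skipped here; the seeds are simply given.

```python
def trustrank(links, trusted_seeds, alpha=0.85, iterations=20):
    """Propagate trust from hand-picked seeds along outgoing links.

    links: dict mapping each page to the list of pages it links to.
    trusted_seeds: pages a human reviewer has judged to be good.
    alpha: damping factor (assumed value; tune as needed).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    seeds = set(trusted_seeds) & pages
    if not seeds:
        raise ValueError("at least one trusted seed must be in the graph")

    # Static distribution: all of the initial trust sits on the seeds.
    seed_dist = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_dist)

    for _ in range(iterations):
        # Every round, a (1 - alpha) share of trust is re-injected at the seeds.
        nxt = {p: (1.0 - alpha) * seed_dist[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue  # pages without outlinks pass nothing on
            # A page splits its damped trust evenly among its outlinks.
            share = alpha * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical toy web: a reviewed university site, pages it reaches,
# and two spam pages that only link to each other and to a blog.
toy_links = {
    "university": ["library", "news"],
    "library": ["university"],
    "news": ["blog"],
    "blog": [],
    "spam1": ["spam2", "blog"],
    "spam2": ["spam1"],
}
scores = trustrank(toy_links, trusted_seeds={"university"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:12s} {scores[page]:.4f}")
```

In this toy graph the two spam pages link to each other as much as they like, but no trusted page links to them, so no trust ever reaches them. Note that their link to the blog does not hurt the blog either: in this sketch trust only flows forward from the seeds, so a page is not penalized for who links to it.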