Fake News and PageRank
PageRank, an algorithm that ranks webpages based on their in-links and out-links, has declined in importance as the growing size and complexity of the web have required new methods of sorting data. Search engines can now refine which sites appear as top recommendations by also considering keywords, anchor text, and usage data. In the PageRank algorithm, each node divides its current PageRank equally among the nodes it directly endorses (links to), and each node’s new value is the sum of the shares it receives. The update is repeated until the values stop changing, at which point the system has reached equilibrium.
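The update described above can be sketched in a few lines of Python. This is only a minimal illustration of the basic update rule, not how any search engine implements it: the function names, the three-page `toy_graph`, and the choice to let a page with no out-links keep its own rank are all assumptions made for the example.

```python
def pagerank_step(graph, ranks):
    """Apply one round of the basic PageRank update.

    graph maps each page to the list of pages it links to.
    ranks maps each page to its current PageRank value.
    """
    new_ranks = {page: 0.0 for page in graph}
    for page, out_links in graph.items():
        if not out_links:
            # Assumption: a page with no out-links keeps its own rank.
            new_ranks[page] += ranks[page]
            continue
        # Each page splits its current rank equally among the pages it endorses.
        share = ranks[page] / len(out_links)
        for target in out_links:
            new_ranks[target] += share
    return new_ranks


def pagerank(graph, rounds=50):
    """Start every page at 1/n and repeat the basic update a fixed number of times."""
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(rounds):
        ranks = pagerank_step(graph, ranks)
    return ranks


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
toy_ranks = pagerank(toy_graph)
```

On this toy graph the values settle near A = 0.4, B = 0.2, and C = 0.4. They add up to one, and applying `pagerank_step` once more leaves them essentially unchanged, which is what equilibrium means here.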
Facebook has recently faced allegations of hosting “fake news,” which has caused public outrage, particularly regarding the 2016 United States presidential election. Because fake accounts from Russia and other Eastern European countries shared misinformation about the election, there is suspicion that Donald Trump gained popularity through the influence of fake news and that this may, in fact, have affected the results of the election.
To combat fake news, Facebook announced that it will use a new algorithm called Click-Gap, which functions similarly to PageRank. The idea behind Click-Gap is that Facebook will look for links that have a high click rate within the site but not across the rest of the Internet. The algorithm uses a conceptual map of the web to identify the authority and hub scores of the links shared on Facebook. In the map, links with high authority and hub scores sit at the center of the graph, and links with lower scores sit at the edges. Using click data from its users, Facebook can compare a link’s relevance on Facebook to its relevance on the wider web to judge whether the link leads to a real, credible source.
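The comparison at the heart of this idea can be illustrated with a toy calculation. Facebook has not published how Click-Gap is computed, so everything below is an assumption made for illustration: the ratio used as a gap score, the `threshold` value, the function names, and the sample numbers are all hypothetical.

```python
def click_gap(facebook_click_share, web_authority_share):
    """Hypothetical gap score: popularity on Facebook relative to standing on the web.

    Both inputs are assumed to be shares between 0 and 1. A large ratio means
    the link draws far more attention on Facebook than its web authority suggests.
    """
    epsilon = 1e-9  # avoids division by zero for links with no web authority
    return facebook_click_share / (web_authority_share + epsilon)


def flag_links(links, threshold=10.0):
    """Return the links whose hypothetical gap score exceeds the threshold."""
    return [
        name
        for name, (facebook_share, web_share) in links.items()
        if click_gap(facebook_share, web_share) > threshold
    ]


# Made-up numbers: (share of clicks on Facebook, share of authority on the web).
sample_links = {
    "established-news.example": (0.30, 0.25),
    "viral-only-on-facebook.example": (0.40, 0.01),
}
flagged = flag_links(sample_links)
```

In this sketch only the second link is flagged, because its share of clicks on Facebook is about forty times its share of authority on the web, while the first link is roughly as prominent in both places.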
In lecture, we discussed the endorsement procedures for updating hub and authority scores, called the Hub Update Rule and the Authority Update Rule. The Authority Update Rule sets each page’s authority score to the sum of the hub scores of the pages that link to it, and the Hub Update Rule then recalculates each hub score as the sum of the new authority scores of the pages it links to. We can then compute normalized scores to see which authorities are the most prominent. We also learned that PageRank is at equilibrium when the normalized scores add up to one and keep their values after another round of the basic update. Click-Gap functions similarly to PageRank by analyzing the hub and authority scores of articles shared on the site. Based on the principles we discussed in class, one can conclude that once these values reach equilibrium, they will stay there no matter how many further updates are performed.
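The two update rules can be sketched as follows. This is a minimal illustration under stated assumptions rather than anything Facebook has described: the function name, the four-page `toy_web`, and the choice to normalize after every round (instead of only at the end) are made up for the example.

```python
def hubs_and_authorities(graph, rounds=20):
    """Repeat the Authority Update Rule and the Hub Update Rule.

    graph maps each page to the list of pages it links to.
    Every page that appears as a link target must also be a key in graph.
    """
    hubs = {page: 1.0 for page in graph}
    auths = {page: 1.0 for page in graph}
    for _ in range(rounds):
        # Authority Update Rule: sum the hub scores of the pages linking in.
        auths = {page: 0.0 for page in graph}
        for page, out_links in graph.items():
            for target in out_links:
                auths[target] += hubs[page]
        # Hub Update Rule: sum the new authority scores of the pages linked to.
        hubs = {page: sum(auths[t] for t in out_links)
                for page, out_links in graph.items()}
        # Normalize so that each set of scores adds up to one.
        auth_total = sum(auths.values()) or 1.0
        hub_total = sum(hubs.values()) or 1.0
        auths = {page: score / auth_total for page, score in auths.items()}
        hubs = {page: score / hub_total for page, score in hubs.items()}
    return hubs, auths


# Hypothetical web: two hub pages pointing at two articles.
toy_web = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
toy_hubs, toy_auths = hubs_and_authorities(toy_web)
```

In this toy web, article `a1` ends up with a higher authority score than `a2` because both hubs endorse it, and `h1` ends up as the stronger hub because it points to both articles.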
I think this method will be effective at determining which sites receive traction outside of Facebook. Both PageRank and the hub and authority updates use the number of in-links and out-links over iterative cycles to arrive at limiting, normalized scores at equilibrium, which tell us which sites are more prominent on the web. Articles linked on Facebook that have little traction elsewhere on the web are more likely to be spam, because it is unlikely that a legitimate article would go viral only on Facebook. Although PageRank has become less central to Google’s search algorithm, understanding the principles behind it allows us to develop new ways to apply the algorithm.
source: https://www.cnbc.com/2019/04/10/facebook-click-gap-google-like-approach-to-stop-fake-news-going-viral.html