Google’s Top Stories Feature vs. PageRank

https://www.theverge.com/2017/10/3/16413082/google-4chan-las-vegas-shooting-top-stories-algorithm-mistake

This article is interesting because it examines Google’s search algorithms and the effectiveness of some recent decisions Google has made about its search features. It focuses on Google’s Top Stories feature and describes a flaw in the algorithm behind it: after the mass shooting in Las Vegas earlier this month, two 4chan threads that identified the wrong man as the shooter appeared in Top Stories. This is very problematic because the threads named the wrong person, which shamed a completely innocent man and delivered fake news to the public. The problem with the Top Stories algorithm, as the article presents it, is that it does not consider the accuracy of information and instead surfaces whatever is most popular and receiving the most clicks and traction. The article also makes an argument that I had never stopped to think about: Google’s search algorithms may implicitly promise accuracy, but Google never explicitly promises to deliver the truth.

This article relates closely to our class discussion of PageRank. In class, we talked about links as direct endorsements and how PageRank assigns scores to webpages depending on how often they are linked to by other pages. We also discussed how search is built on information retrieval: search engines ‘crawl’ the web, build a giant index of pages, and use the links between those pages to decide which results to rank highest. Finally, we discussed two situations a search engine can face: scarcity, when there may be only one document that answers your question, and abundance, when there may be millions of relevant pages and you need to find the best ones. This article reminds me of the abundance case, and it also stresses that PageRank’s algorithm is more effective than the Top Stories algorithm.
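To make the idea of links as endorsements concrete, here is a minimal sketch of the basic PageRank update as it is usually taught, not Google’s production system. The page names, the damping value of 0.85, and the number of iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by repeated updating.

    links maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to a wire report, nobody links to the forum thread.
toy_web = {
    "news_site": ["wire_report"],
    "blog": ["wire_report", "news_site"],
    "wire_report": ["news_site"],
    "forum_thread": ["wire_report"],
}

scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy web, the forum thread receives no links from other pages, so it ends up with only the baseline score, while the pages that others link to rank higher.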

The article points out that the only difference Google has revealed about its Top Stories algorithm is that it takes into account a “freshness” quality for the articles that appear there. Google will not say anything more about the algorithm. The article then asks why Google doesn’t drop the separate Top Stories ranking and instead use its original PageRank search algorithm to identify these “Top Stories.” I think this argument is especially relevant to our class discussion because it implicitly says that PageRank succeeds by taking into account how many other pages link to a page: when one page links to another, it implies that the linked page has some credibility. This holds as long as the linking page is itself credible, which we can assume is true of most pages. And, of course, this article relates to abundance, which we discussed in class, because Google’s algorithms have to find a way to present links when there are millions of relevant pages that could be shown.
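The contrast between the two approaches can be sketched with a toy comparison. Google has not published how Top Stories works, so the scoring functions, the six-hour half-life, and every number below are made-up assumptions, meant only to show how a ranking driven by clicks and freshness can prefer a different item than a ranking driven by inbound links.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    clicks: int          # hypothetical click count
    age_hours: float     # hypothetical time since publication
    inbound_links: int   # hypothetical number of pages linking to it


def freshness_popularity_score(story, half_life_hours=6.0):
    # Assumed formula: clicks discounted by age, halving every half_life_hours.
    return story.clicks * 0.5 ** (story.age_hours / half_life_hours)


def link_endorsement_score(story):
    # Assumed stand-in for a link-based ranking: count of inbound links.
    return story.inbound_links


# Invented example data, not real measurements.
stories = [
    Story("viral forum thread", clicks=9000, age_hours=1.0, inbound_links=3),
    Story("established news report", clicks=12000, age_hours=12.0, inbound_links=250),
]

by_freshness = max(stories, key=freshness_popularity_score).title
by_links = max(stories, key=link_endorsement_score).title

print("Top by clicks and freshness:", by_freshness)
print("Top by inbound links:", by_links)
```

With these invented numbers, the new and heavily clicked thread wins under the freshness-and-popularity score, while the older report with many inbound links wins under the link-based score. Neither score checks whether the content is accurate.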
