Flaws in Google’s Top Stories Algorithm

Sources:
https://www.theverge.com/2017/10/3/16413082/google-4chan-las-vegas-shooting-top-stories-algorithm-mistake
http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm

 

After the Las Vegas shooting earlier this month, it became clear that while Google has developed a very powerful search engine, it is far from perfect. Following the event, a search about what had occurred would first return links to two 4chan threads, both of which incorrectly identified the shooter.

Google’s engine evaluates the overall “importance” of websites and returns links based on the search query. Google has also recently implemented a feature called Top Stories, which identifies pages that may not be as “important” but contain fresh content about a recent event. These stories are given additional weight for being “breaking” content, which brings them to the top of the search results alongside the more “important” but less “breaking” pages.
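To make the trade-off concrete, here is a minimal sketch of how a freshness boost could lift a new page above an established one. Since the real Top Stories algorithm is not public, everything here is an assumption made for illustration: the `Page` fields, the exponential decay, the `half_life_hours` and `freshness_weight` parameters, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    importance: float  # hypothetical query-independent score in [0, 1]
    age_hours: float   # hours since the page was published


def freshness_boost(age_hours: float, half_life_hours: float = 6.0) -> float:
    """Boost that starts at 1.0 for a brand-new page and halves every half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def combined_score(page: Page, freshness_weight: float = 1.0) -> float:
    """Importance plus a weighted freshness boost (an assumed scoring rule)."""
    return page.importance + freshness_weight * freshness_boost(page.age_hours)


def rank(pages: list[Page], freshness_weight: float = 1.0) -> list[Page]:
    """Return pages ordered from highest to lowest combined score."""
    return sorted(
        pages,
        key=lambda p: combined_score(p, freshness_weight),
        reverse=True,
    )


# Hypothetical example: an established page versus an hour-old forum thread.
pages = [
    Page("established.example/background", importance=0.9, age_hours=720.0),
    Page("forum.example/thread", importance=0.1, age_hours=1.0),
]

for page in rank(pages):
    print(page.url, round(combined_score(page), 3))
```

With the freshness weight set to zero, the established page ranks first. With the default weight, the hour-old thread overtakes it, even though nothing in the score reflects whether its content is accurate.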

This allows articles about relevant or current news to break through the mass of existing pages and appear at the top, even though they were only recently created. The feature is not without its faults, however. Opening this channel for factual news articles also opens it for inaccurate ones. As a result, inaccurate news can spread at an alarming rate that was not possible before the creation of the Top Stories feature.

This can be related to our in-class discussion of search algorithms and hub and authority scores. While Google made an effort to push news stories to the front of its search results, the authority of the source providing that information must not be overlooked. The algorithm used to determine the Top Stories is not public, so it is difficult to pinpoint exactly where the fault lies. Although the algorithm abandons the idea of choosing from a set list of validated news sources, it could still try to determine whether a source resembles a news outlet rather than a forum, where anyone can easily and quickly post inaccurate content in the moment.

It is difficult to know whether a given news source is reputable, but a news outlet would almost always be more accurate than forums or social media sites. Since social media sites and forums have many links within themselves and between each other, they are able to develop high hub scores even though those hubs are not really relevant for news searches. One possible change would be to have a set list of sites that can contribute to hub scores for news outlets. This leaves some wiggle room in determining which sources can contribute to Top Stories, while still limiting them to sources that can be verified to some degree.
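The proposed change can be sketched with the hub and authority update rules from class. This is only one possible reading of the idea, not a description of anything Google does: the `allowed_hubs` parameter, the sum-based normalization, and the tiny link graph with its site names are all hypothetical. In this sketch, when an allowlist is given, only links from allowlisted sites count toward authority scores.

```python
def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Scale scores so they sum to 1 (left unchanged if they are all zero)."""
    total = sum(scores.values())
    if total == 0:
        return scores
    return {node: value / total for node, value in scores.items()}


def hits(links, allowed_hubs=None, iterations=50):
    """Iterate hub/authority updates; optionally restrict which hubs count."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to each page.
        new_auth = {node: 0.0 for node in nodes}
        for source, targets in links.items():
            if allowed_hubs is not None and source not in allowed_hubs:
                continue  # links from non-allowlisted sites are ignored
            for target in targets:
                new_auth[target] += hub[source]
        # Hub update: sum the authority scores of the pages each page links to.
        new_hub = {
            node: sum(new_auth[t] for t in links.get(node, ()))
            for node in nodes
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical link graph: densely interlinked forums versus two news hubs.
links = {
    "forum_a": ["forum_b", "forum_thread"],
    "forum_b": ["forum_a", "forum_thread"],
    "social": ["forum_thread", "forum_a"],
    "wire_service": ["news_article"],
    "news_index": ["news_article"],
}

_, auth_open = hits(links)
_, auth_limited = hits(links, allowed_hubs={"wire_service", "news_index"})
print("open:", auth_open["forum_thread"], auth_open["news_article"])
print("limited:", auth_limited["forum_thread"], auth_limited["news_article"])
```

Without the allowlist, the heavily interlinked forum pages reinforce one another and the forum thread ends up with a higher authority score than the news article. With the allowlist, only the two news hubs contribute, so the news article comes out ahead. The cost is that someone has to maintain the list.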
