The Significance of Defining Importance: Proving the Wisdom-of-Crowds Justification of Link-Based Ranking Algorithms

As we learned in class, analyzing the links between web pages plays an important role in web search and in understanding why people visit the sites they do. An interesting paper I read claimed that while the PageRank Update Rule does influence which sites are treated as important, there is limited evidence for the wisdom-of-crowds justification for PageRank. Thus, the authors suggest a dual model approach that combines different understandings of how linking works. Google's PageRank algorithm works so that the more web pages link to a given page, the higher that page’s PageRank. Additionally, the page’s PageRank increases as the PageRank of the pages that link to it increases. This shows how PageRank depends not only on the local topology of the web but also on its global topology. Google's founders gave what is known as the wisdom-of-crowds justification for PageRank (WCJPR), but the WCJPR thesis has yet to be conclusively defined and proven.
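To make the update rule concrete, here is a minimal sketch of PageRank as repeated rank-passing on a tiny made-up web. The four-page graph `toy_web`, the damping value of 0.85, and the function name `pagerank` are my own assumptions for illustration; this is the textbook scaled update, not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeatedly applying the scaled update rule.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page receives a small baseline share (the "teleport" part).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank equally among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, A has only one in-link, but it comes from the highly ranked page C, so A ends up ranked above B. That is the global-topology effect: who links to you matters, not just how many pages do.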

The way that PageRank operates leads to the claim that while one link may not be an indicator of importance, the accumulation of links is. This fits the wisdom-of-crowds theory that “even if most of the people within a group are not especially well-informed or rational, [the group] can still reach a collectively wise decision”. The first point the paper made was that while current preferential models are somewhat realistic models of the complete web, they do not offer evidence for the WCJPR thesis. For instance, if a random surfer starts at a web page and keeps clicking links to related web pages, the pages the surfer lands on most often are the ones with the highest PageRank. However, this does not necessarily mean that web pages with a higher PageRank are more important or of higher quality than other web pages. It could just indicate that some web pages are more popular but not necessarily informative or useful. Thus, PageRank’s underlying assumption that more important websites are more likely to receive links from other web pages is not supported by these models.
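The random surfer idea can be sketched as a short simulation. The graph `toy_web`, the 0.85 probability of following a link, the step count, and the function name `random_surfer` are all assumptions of mine, chosen only to illustrate the point.

```python
import random


def random_surfer(links, steps=100_000, damping=0.85, seed=0):
    """Estimate how often a random surfer lands on each page.

    With probability `damping` the surfer follows a random out-link;
    otherwise (or on a page with no out-links) it jumps to a random page.
    """
    rng = random.Random(seed)
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)

    for _ in range(steps):
        visits[current] += 1
        targets = links.get(current, [])
        if targets and rng.random() < damping:
            current = rng.choice(targets)  # click a link on the current page
        else:
            current = rng.choice(pages)    # get bored and jump anywhere

    return {p: count / steps for p, count in visits.items()}


# Hypothetical four-page web, invented for this example.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

visit_share = random_surfer(toy_web)
for page, share in sorted(visit_share.items(), key=lambda item: -item[1]):
    print(page, round(share, 3))
```

Notice that nothing in the simulation knows anything about the quality of a page. The visit shares come entirely from the link structure, which is why a high score on its own shows popularity rather than importance.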

The paper’s second argument is that while a linking model proposed by Masterton does try to account for the PageRank reasoning, it is not a realistic model because it does so for the wrong reasons. The Masterton (MOA) model assumes that the strength of the attraction to importance varies with the competence of the webmaster of the source page. This means that the more competent the webmaster, the more the source page links to important web pages, and the less competent the webmaster, the weaker and more random its links are. Using page importance, a competence factor, and other constraints, the researchers were able to create models to compare the relationships between the variables. While the model does reproduce proper degree distributions, the importance distribution has to be chosen in a quite ad hoc way to generate the right web topologies. Since the importance distributions would have to be unusual ones whose only purpose is to generate the correct topologies, the model does not show a clear and direct link between importance and higher PageRank values.
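The general idea of competence-dependent linking can be illustrated with a toy generator. To be clear, this is not the MOA model's actual formulation, which I have not reproduced here. The rule "with probability equal to competence, pick a target in proportion to importance, otherwise pick uniformly at random" and the names `generate_links` and `in_degree` are assumptions of mine.

```python
import random


def generate_links(importance, competence, links_per_page=3, seed=0):
    """Toy link generator (an illustrative assumption, not the MOA model).

    importance: page -> positive importance score
    competence: page -> probability in [0, 1] that its webmaster picks a
                link target in proportion to importance instead of at random
    """
    rng = random.Random(seed)
    pages = list(importance)
    links = {}
    for source in pages:
        others = [p for p in pages if p != source]  # no self-links
        weights = [importance[p] for p in others]
        targets = set()
        for _ in range(links_per_page):
            if rng.random() < competence[source]:
                # Competent choice: attracted to important pages.
                target = rng.choices(others, weights=weights, k=1)[0]
            else:
                # Incompetent choice: a uniformly random page.
                target = rng.choice(others)
            targets.add(target)
        links[source] = sorted(targets)
    return links


def in_degree(links):
    """Count how many pages link to each page."""
    counts = {p: 0 for p in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


# Hypothetical importance scores: page19 is the most important page.
importance = {f"page{i}": float(i + 1) for i in range(20)}
careful = in_degree(generate_links(importance, {p: 0.9 for p in importance}))
careless = in_degree(generate_links(importance, {p: 0.1 for p in importance}))
print("competent webmasters:", careful)
print("careless webmasters: ", careless)
```

Even in this toy version, how well in-links track importance depends heavily on the importance scores and competence values I chose to plug in, which is similar in spirit to the paper's worry about ad hoc importance distributions.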

Due to these two points, the paper concludes that there is not yet a satisfactory justification for the wisdom-of-crowds thesis. This conclusion is very relevant to the web search concepts that we learned in class because it indicates that even though we have a method to numerically measure importance within a network of websites (PageRank), we still have only a rudimentary understanding of web links and of how web users form networks in general.

Reference paper: https://link.springer.com/article/10.1007/s13347-017-0274-2
