


After PageRank–Mixed Ranking Algorithm and Knowledge Graph

I read in another blog post that Google stopped updating its PageRank toolbar in 2013, and I became interested in Google’s recent updates to its search algorithm. In 2011, Google released Panda in order to rank low-quality pages lower. According to CNET, the change led to “a surge in the rankings of news websites and social networking sites and a drop in rankings for sites containing large amounts of advertising”.[1] Google announced Penguin in 2012, which penalizes some black-hat SEO practices that manipulate a page’s link in-degree. The update discourages these practices and makes page ranking more reliable. Google also introduced Hummingbird in 2013, which supports semantic search. It involves more language processing and should help the search engine understand queries better. Meanwhile, Google keeps accumulating the search histories of millions of users.

Google has made a lot of changes over the years, but according to Google search chief Amit Singhal, 2001, when he first joined the company, was perhaps the last time before Hummingbird that the algorithm was rewritten so dramatically. [2] Google uses its understanding of users to develop auxiliary methods that assist PageRank and optimize the results. It now cares more about the quality and relevance of pages and integrates indicators of both into the ranking.

Wikipedia lists several alternatives to PageRank, including Hummingbird and IBM’s CLEVER. [3] While Hummingbird makes improvements on PageRank, IBM’s CLEVER is based on Kleinberg’s “Authority” and “Hub” model. To improve the quality of the returned pages, it tries suppressing nepotistic links, introducing exemplary hubs and authorities, and weighting relevance. If you want to know more about CLEVER, you can check out the paper “Core Algorithms in the CLEVER System.” [4] From that paper, I also learned that search engines actually take in 5 sets of queries when we search for things (I am not sure whether that is particular to CLEVER or applies to all search engines). But both CLEVER and Hummingbird seem to be optimizations of the earlier “authority and hub” algorithm and the PageRank algorithm, or a mixed integration of them. I am wondering whether there is a third applicable search algorithm that differs from them at its root.
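To make the “Authority” and “Hub” idea more concrete, here is a minimal sketch of the basic mutual-reinforcement iteration from Kleinberg’s model: a page is a good authority if good hubs link to it, and a good hub if it links to good authorities. This is only the textbook iteration, not CLEVER itself, and it leaves out CLEVER’s extras such as suppressing nepotistic links and weighting relevance. The function names and the toy link graph are my own assumptions for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(graph, iterations=50):
    """Basic hub/authority iteration.

    graph: dict mapping each page to the list of pages it links to
    (a hypothetical toy format, not anything CLEVER defines).
    Returns (hub_scores, authority_scores).
    """
    pages = set(graph) | {dst for targets in graph.values() for dst in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page = sum of the hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in graph.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score of a page = sum of the authority scores it links to.
        new_hub = {p: sum(new_auth[dst] for dst in graph.get(p, [])) for p in pages}

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Toy graph: two directory-like pages link to the same two content pages.
toy_graph = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b"],
    "a": [],
    "b": ["a"],
}
hub_scores, auth_scores = hits(toy_graph)
```

In this toy graph, page "a" receives links from every other page, so it ends up with the highest authority score, while "hub1" and "hub2" share the highest hub score because they point at both content pages.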

Search engine companies seem to be shifting their focus to language processing and knowledge graphs. Search engines want to know what users are really asking for and what the content is really about, instead of just matching keywords. Search engines also try to go further and turn unstructured text into a structured format. I am now quite used to Google’s Knowledge Graph, but it was a really exciting update. I think there will be more breakthroughs in that area, and they will definitely change our search experience. Knowledge graphs may enable us to get to answers much faster in the future, but may also encourage lazy thinking habits if people rely too much on search engines’ selection of information.
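As a rough illustration of what “structured format” can mean, a knowledge graph is often pictured as a set of (subject, predicate, object) facts that can be looked up directly instead of matched by keyword. The sketch below is my own toy example, not how Google’s Knowledge Graph is actually built. The class name, the predicate names, and the storage layout are all assumptions, and the facts are just the release dates mentioned earlier in this post.

```python
from collections import defaultdict


class TripleStore:
    """A toy in-memory store of (subject, predicate, object) facts."""

    def __init__(self):
        # subject -> list of (predicate, object) pairs
        self._by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].append((predicate, obj))

    def query(self, subject, predicate):
        """Return every object stored for the given subject and predicate."""
        facts = self._by_subject.get(subject, [])
        return [obj for pred, obj in facts if pred == predicate]


store = TripleStore()
# Facts taken from this post, written as structured triples.
store.add("Panda", "released_by", "Google")
store.add("Panda", "release_year", "2011")
store.add("Penguin", "release_year", "2012")
store.add("Hummingbird", "release_year", "2013")

# A question like "when was Panda released?" becomes a direct lookup.
answer = store.query("Panda", "release_year")
```

With facts stored this way, the engine can return the answer itself rather than a list of pages that happen to contain the words “Panda” and “released”.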

References:

[1] https://en.wikipedia.org/wiki/Google_Panda

[2] http://searchengineland.com/google-hummingbird-172816

[3] https://en.wikipedia.org/wiki/PageRank

[4] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.94.8554&rep=rep1&type=pdf

 
