Watch Out, PageRank! Here Comes RankBrain: Google’s Use of AI to Optimize and Improve Search Results
As we learned in class, PageRank was one of the earliest search ranking algorithms, developed by Google founders Larry Page and Sergey Brin and implemented early in the company’s history. Using refined text-matching analysis together with the number of incoming and outgoing links between web pages, the PageRank algorithm determines the most relevant and important pages for a given search. Specifically, PageRank can be computed as follows for a small network of web pages:
- In a small network of n nodes, all the nodes start out with the same initial PageRank of 1/n.
- Then, a number of steps k is chosen, and a sequence of k updates is performed on the PageRank values.
- During each update, every page divides its current PageRank equally across its outgoing links, and each page’s new PageRank becomes the sum of the shares it receives. If a page has no outbound links, it passes all of its current PageRank to itself.
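The steps above can be sketched in a few lines of Python. This is only a minimal illustration of the basic update rule from the textbook, not Google's production code. The function name `basic_pagerank`, the `links` dictionary format, and the three-page example graph are assumptions made up for this post. Exact fractions are used so the values can be checked by hand.

```python
from fractions import Fraction


def basic_pagerank(links, k):
    """Run k rounds of the basic PageRank update on a small graph.

    `links` (a hypothetical input format) maps each page to the list of
    pages it links to. Every linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    # Every page starts with the same PageRank of 1/n.
    rank = {page: Fraction(1, n) for page in pages}

    for _ in range(k):
        new_rank = {page: Fraction(0) for page in pages}
        for page, outgoing in links.items():
            if not outgoing:
                # A page with no outbound links keeps all of its PageRank.
                new_rank[page] += rank[page]
                continue
            # Otherwise the page splits its PageRank equally across its links.
            share = rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank


# Made-up example: A links to B and C, B links to C, C links back to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = basic_pagerank(example_links, 2)
for page, value in ranks.items():
    print(page, value)
```

Because every page hands out exactly what it currently holds, the PageRank values always add up to 1 after each update.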
Google extends this procedure to its search function, where the process is repeated many times until the PageRank of every web page relevant to a given search has been computed. Other factors being equal, Google then lists the web page with the highest PageRank at the top of the search results. The PageRank algorithm has been the backbone of Google Search for years and is one of the main components of Hummingbird, the official name for Google’s overall search algorithm. Hummingbird takes into account more than 200 major factors and over 10,000 minor factors when ranking the results for a given search, such as location, the presence of the search term in the page’s HTML, the speed of the website, mobile-friendliness, and, of course, the PageRank of the page and/or domain. Many experts believe that PageRank is either the most important or the second most important factor in determining the result rankings for a given query.
Recently, however, Google announced that it has developed a machine-learning artificial intelligence system called RankBrain, which it uses to help compute and rank search results. RankBrain is also part of the overall Hummingbird search algorithm. RankBrain has been so successful in testing over the past few months that Google has determined it to be the third most important factor in determining the search result rankings for a query. Google developed RankBrain mainly to help find pages for difficult searches, especially since 15% of the 3 billion searches that Google processes per day have never been seen before. Previously, Google dealt with these complex, unseen queries by having programmers and other workers refer the search to synonym lists or create large database connections. With RankBrain, however, the system can interpret the word or phrase in the background to find the most relevant topics for a complex search. As Google stated, “RankBrain is able to see patterns between unconnected complex searches and understand and identify their similarities”. Moreover, RankBrain learns from these situations and applies the same reasoning to future searches. In other words, RankBrain analyzes a long, complex, unseen query and tries to connect it with a shorter, more commonly searched request based on similarities to the original search. RankBrain can then use what it knows about the more commonly searched query to return similar results for the long, complex one.
If RankBrain sees a phrase that it is completely unfamiliar with, it can make an educated guess based on previous similar searches and filter the search results accordingly. Furthermore, Google frequently updates the system with new data to help it reason better about new and unseen concepts. Google also gives it large amounts of historical search data and trains it to predict the results accurately. This new major factor in the Hummingbird search algorithm can work extremely well alongside the PageRank algorithm. For example, if a complex, unseen search comes in, RankBrain can compare that query with similar queries from the past. Then, using PageRank, the Hummingbird search algorithm can pick out the pages with the highest PageRank scores from the results of those similar queries and return them as the highest-rated, or most-linked, results for the uncommon query.
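To make that cooperation concrete, here is a toy sketch in Python. RankBrain's internals are not public, so this is not how RankBrain actually works. As a stand-in for its learned notion of similarity, the sketch uses simple word overlap (Jaccard similarity) between queries. The function names, the query log, the page names, and the PageRank scores are all made up for illustration.

```python
def jaccard(query_a, query_b):
    """Word-overlap similarity between two queries (stand-in assumption,
    not RankBrain's real similarity measure)."""
    words_a = set(query_a.lower().split())
    words_b = set(query_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def rank_unseen_query(query, known_results, pagerank):
    """Find the most similar known query, then order its result pages
    by PageRank, highest first."""
    closest = max(known_results, key=lambda known: jaccard(query, known))
    pages = known_results[closest]
    ranked = sorted(pages, key=lambda page: pagerank[page], reverse=True)
    return closest, ranked


# Hypothetical log of common queries and the pages returned for them.
known_results = {
    "pagerank algorithm": ["page_a", "page_b"],
    "chocolate cake recipe": ["page_c", "page_d"],
}
# Hypothetical PageRank scores for those pages.
pagerank = {"page_a": 0.1, "page_b": 0.4, "page_c": 0.3, "page_d": 0.2}

closest, ranked = rank_unseen_query(
    "how does the pagerank algorithm rank pages", known_results, pagerank
)
print(closest, ranked)
```

The long query shares two words with "pagerank algorithm" and none with the cake query, so the pages for the first query are reused and sorted by their PageRank.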
Based on the recent success of RankBrain (over a very short span of just a few months), I believe that it will become more and more effective and efficient at ranking web pages as artificial intelligence and machine learning develop. I think it will become so effective at optimizing search results using data and knowledge from previous searches that it could very soon overtake PageRank as the most important factor in determining search results.
Sources:
http://searchengineland.com/faq-all-about-the-new-google-rankbrain-algorithm-234440
http://searchengineland.com/what-is-google-pagerank-a-guide-for-searchers-webmasters-11068
Course textbook: Networks, Crowds, and Markets by David Easley and Jon Kleinberg