Google’s BERT

In class, we have discussed the search industry, particularly with regard to internet search services such as Google. We've examined algorithms behind these services, such as PageRank, and explored the market pricing for per-click advertising. However, another interesting element of the internet search problem is how to parse user queries. Google has, in the past, treated queries as a “bag of words,” selecting the most important words from the query to drive the search results. For example, if you search “Who is the wealthiest actor?”, the words “wealthiest” and “actor” will have much more influence over how search results are chosen than the words “is”, “the”, and “who”. This is logical and reflects the meaning of the query in this case: pages closely related to “wealthiest” and “actor” with high PageRank values will be shown (as well as relevant ads).

However, consider the query “How to park on a hill with no curb”. Using the old model, “park”, “hill”, and “curb” will be prioritized, often giving results on how to park on a hill with a curb, which is exactly the opposite of what is intended. Okay, then the easy solution is just to incorporate more words as priority words, right? Just add the negation word “no” to the priority words. But this raises a new issue: to what does the negation apply? In this case it is perhaps intuitive to assume “curb”, but that assumption relies on the ordering of the words, which a bag of words discards. Consider another case: “Can I get medicine for someone else from pharmacy”. The bag-of-words approach might fail here too, because it does not recognize the importance of the phrase “for someone else” and where it sits in the query.
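To make the failure concrete, here is a minimal sketch of a bag-of-words reduction in Python. The stop-word list and the `bag_of_words` function are my own illustration, not Google's actual implementation; the point is only that once unimportant words and word order are thrown away, the “no curb” and “a curb” queries become indistinguishable.

```python
# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {
    "a", "an", "the", "is", "who", "how", "to", "on",
    "with", "no", "for", "from", "can", "i",
}

def bag_of_words(query):
    """Lowercase the query, strip punctuation, drop stop words, discard order."""
    words = (w.strip("?.,!\"'") for w in query.lower().split())
    return frozenset(w for w in words if w and w not in STOP_WORDS)

no_curb = bag_of_words("How to park on a hill with no curb")
with_curb = bag_of_words("How to park on a hill with a curb")

print(sorted(no_curb))       # ['curb', 'hill', 'park']
print(no_curb == with_curb)  # True: the two opposite queries look identical
```

Removing “no” from the stop-word list would separate these two queries, but the bag would still not record which word the negation belongs to.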

This is where Google’s recent query parsing update, BERT, comes in. BERT stands for Bidirectional Encoder Representations from Transformers and, basically, it improves natural language processing by applying machine learning to a large data set, so that each word in a query is interpreted in the context of the words both before and after it. This new parsing strategy, which is in the process of being rolled out to US users, should be able to handle the query examples given above, as well as other complex queries. But exactly what impact will this have? To the average user, the change will not be too significant; their queries will be parsed in a more intelligent manner, but it is not a massive change. However, to companies and others reliant on search result ordering, the ramifications could be significant. Imagine you poured significant money and time into ensuring your website was a first-page result for many related search queries. Now suddenly, with the BERT update, your website is a second-page result and your user traffic has gone down 30%. Such a change can be catastrophic and is the reason why, as we have discussed in class, there is an entire industry centered around optimizing websites for search result ordering.

Overall, I think this back-and-forth is worth watching: Google (and other search engines) refine search and query handling for better understanding, and then websites work to exploit the new system to improve their result position. I think it would be very interesting to look at how BERT affects page rankings in the long term: as some sources fall in relevance, they may be linked to less and lose ranking, further lowering their result position and beginning a self-reinforcing downward spiral. This would obviously be quite extreme, but could be an interesting edge case to look into.
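As a rough illustration of the first step of that spiral, here is a toy version of the textbook PageRank iteration we covered in class. The four-page graph, the page names, and the damping factor of 0.85 are all assumptions made up for this sketch, and it says nothing about how Google actually ranks pages today. It only shows that when two pages stop linking to “site”, its PageRank score drops.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic power-iteration PageRank. `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph before the ranking change.
before = {
    "site": ["hub"],
    "hub": ["site", "blog", "news"],
    "blog": ["site", "hub"],
    "news": ["site", "hub"],
}
# Afterwards, "blog" and "news" no longer link to "site".
after = dict(before, blog=["hub"], news=["hub"])

rank_before = pagerank(before)["site"]
rank_after = pagerank(after)["site"]
print(rank_after < rank_before)  # True: fewer inbound links, lower score
```

Whether a real site would then lose still more links is exactly the long-term question; the sketch only covers a single round of lost links.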

 

Sources: 

Haden, Jeff. “Google Just Announced a Major Search Algorithm Change That Users Will Probably Love, and Some Businesses May Absolutely Hate.” Inc.com, Inc., 25 Oct. 2019, www.inc.com/jeff-haden/google-just-announced-a-major-search-algorithm-change-that-users-will-probably-love-and-some-businesses-may-absolutely-hate.html.

Bensinger, Greg. “Google Is Making a Big Change to Its Vaunted Search Engine. You Might Not Notice.” The Washington Post, WP Company, 25 Oct. 2019, www.washingtonpost.com/technology/2019/10/25/google-is-making-big-change-its-vaunted-search-engine-you-might-not-notice/.

 
