Google announces detailed changes to their search algorithm
http://www.huffingtonpost.com/2011/11/14/google-algorithm-changes_n_1093429.html?ref=technology
Google makes about 500 changes to its search formula every year. While most of the formula is kept a company secret, Google has started publishing some of its changes on its own blog. Last Monday’s blog post described the 10 most recent changes in detail: http://insidesearch.blogspot.com/2011/11/ten-recent-algorithm-changes.html. Although one may never fully understand Google’s secret formula for its search engine, one can still learn a good deal from the changes it announces. Two changes in particular caught my attention: “Fresher, more recent results” and “Retiring a signal in Image search.”
The two changes mentioned above serve essentially the same purpose. Search engine results show the web pages of the highest “quality” (relevance, popularity, etc.), and because the web is continuously evolving, with a flood of new information pouring in every second, a page that was ranked highest yesterday may not be (or should not be) the highest ranked page tomorrow. As I read it, Google applied this reasoning to its Image search and aimed to refine the results by “retiring” images that no longer appear to have a significant impact. Instead of just rearranging the existing results, retiring means removing some of them, because there are more choices and results than a user can process. Unlike web pages, for which a search engine can draw on a relatively ample amount of information (user preferences, in-links, etc.) to provide accurate and consistent results, Image search offers few such signals. Someone searching for an image typically scrolls through the first one or two pages of results and clicks on one image for any number of personal reasons.
In this post, I wish to emphasize the term Google used, retiring information, as it seems to hold the key to future search engine formulas, not only for images but most likely for web pages as well. The reason for such a retirement of information can be explained through Section 18.2 of the textbook, on power laws. A power-law distribution shows up as a straight line on a log-log plot (see Figure 18.2 of the textbook). But as mentioned, the web is continuously growing. Even though current findings repeatedly show that the fraction of Web pages that have k in-links is approximately 1/k^2, the total number of pages will increase, so the number of pages at every value of k rises and the whole straight line in Figure 18.2 shifts upward. Even the rate of this shift, which reflects the increase in user input, is itself increasing thanks to personal gadgets that provide internet access, such as smartphones.
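The upward shift described above can be checked numerically. The following is only a rough sketch under stated assumptions: the fraction of pages with k in-links is taken to be exactly proportional to 1/k^2 up to a cutoff, and the page totals (one million and ten million) and the cutoff of 1000 in-links are made-up illustrative values, not real web statistics. The function names are hypothetical.

```python
import math

def expected_counts(total_pages, max_k, exponent=2.0):
    """Expected number of pages with k in-links for k = 1..max_k,
    assuming the fraction of pages is proportional to 1/k**exponent."""
    weights = [k ** -exponent for k in range(1, max_k + 1)]
    norm = sum(weights)  # normalize so the fractions sum to 1
    return [total_pages * w / norm for w in weights]

def loglog_fit(counts):
    """Least-squares slope and intercept of log10(count) against log10(k)."""
    xs = [math.log10(k) for k in range(1, len(counts) + 1)]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Illustrative totals only (assumed values, not measurements).
small_slope, small_intercept = loglog_fit(expected_counts(1_000_000, 1000))
large_slope, large_intercept = loglog_fit(expected_counts(10_000_000, 1000))

# The slope stays near -2 in both cases; only the intercept moves.
print("slopes:", small_slope, large_slope)
print("intercept shift:", large_intercept - small_intercept)
```

Under these assumptions, multiplying the total number of pages by ten leaves the slope of the line unchanged and raises the intercept by log10(10) = 1, which is the sense in which the whole line shifts upward.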
The effect of increasing user input on the web as a whole can be illustrated simply with the number of views on a YouTube video. Only a few years ago, a million views was fairly rare and made a video a global hit, whereas a million views today is very common. Just as a million-view video is no longer the most accurate result today, Google saw the need to retire some images that once drew millions of views but are no longer as relevant. It may not be long before “popularity” or “number of in-links” becomes an obsolete measure for ranking web pages. One can only guess at the new algorithms that will alter the ever-changing Google engine.