Search Engine Optimization
Over the past ten to fifteen years, the sophistication behind search engine optimization has grown dramatically. Whereas search engines of that earlier era implemented searches with comparatively simple approaches, usually built heavily on Boolean and logical operators, modern engines combine many weighted signals to deliver precise, relevant results in a matter of seconds. While various search engines may employ different approaches to ordering their results, a number of common techniques and ideas recur in search engine optimization, including zone indexes and relevance feedback.
The idea behind zone indexes is that every web page can be split into four zones: title, description, author, and content. Assigning each zone a specified weight makes it straightforward to tabulate a score for how relevant the page is to a search term; ideally, the content and title zones carry the highest weights, since those regions are where the most relevant material is expected to be concentrated. Implementing such a method filters out irrelevant or marginally relevant results that might previously have circumvented the search algorithm and appeared among the first few results.
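The weighted zone scoring described above can be sketched in a few lines of Python. The zone weights, page fields, and function names here are illustrative assumptions, not values from any real search engine:

```python
# Weighted zone scoring: sum the weights of the zones in which a term appears.
# The weights below are illustrative; real engines tune them empirically.
ZONE_WEIGHTS = {"title": 0.35, "description": 0.15, "author": 0.05, "content": 0.45}

def zone_score(page, term):
    """Score a page for a query term by summing weights of matching zones."""
    score = 0.0
    term = term.lower()
    for zone, weight in ZONE_WEIGHTS.items():
        if term in page.get(zone, "").lower():
            score += weight
    return score

page = {
    "title": "Intro to Search Engines",
    "description": "How ranking works",
    "author": "Jane Doe",
    "content": "Search engines rank pages by relevance to the query.",
}
print(zone_score(page, "search"))  # matches title and content: 0.35 + 0.45 = 0.8
```

Because title and content dominate the weighting, a page that mentions the term only in a low-weight zone (such as the author field) scores far lower.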
This raises one large question: how is a search engine able to differentiate the zones of a web page? Generally speaking, search engines can examine the text-to-code ratio of each block on the page. A block with a high text-to-code ratio is likely to be a content zone, while a block with a minimal text-to-code ratio is more likely to be something like a menu on the page. Note, however, that these automated distinctions vary in accuracy, depending on the tool used to dissect the web page as well as the type of document in question; such tools tend to handle XML documents quite well, while HTML documents are more difficult.
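A minimal sketch of the text-to-code ratio heuristic, assuming simple well-formed HTML fragments (the regex tag-stripping here is a toy approximation; a real system would use a proper HTML parser):

```python
import re

def text_to_code_ratio(html):
    """Fraction of an HTML fragment that is visible text rather than markup."""
    if not html:
        return 0.0
    text = re.sub(r"<[^>]*>", "", html)  # strip tags, keep inner text
    return len(text) / len(html)

# A prose-heavy block scores high; a markup-heavy menu scores low.
content_block = "<p>Search engines estimate which blocks hold real content.</p>"
menu_block = '<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'

print(text_to_code_ratio(content_block))  # high: mostly text
print(text_to_code_ratio(menu_block))     # low: mostly markup
```

The threshold separating "content" from "navigation" blocks would be tuned per document type, which is one reason accuracy varies between XML and HTML sources.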
As an alternative approach, another widely used idea in search engine optimization is the concept of relevance feedback. The term refers to the practice of assigning varying weights to the terms in a query based on known relevant and irrelevant results. Even within this method there are various ways of achieving relevance feedback, but a common approach is to use Rocchio's feedback formula, in which the coefficients alpha, beta, and gamma weight the original query terms, terms from relevant documents, and terms from irrelevant documents, respectively.
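The formula itself does not appear in the original text; the standard form of Rocchio's relevance feedback formula, with the coefficients described above, is:

```latex
\vec{q}_m = \alpha \, \vec{q}_0
  + \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j
  - \frac{\gamma}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j
```

Here \(\vec{q}_0\) is the original query vector, \(D_r\) and \(D_{nr}\) are the sets of known relevant and non-relevant document vectors, and \(\vec{q}_m\) is the modified query: it is pulled toward the centroid of relevant documents and pushed away from the centroid of irrelevant ones.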
In conclusion, one should note that the aforementioned practices are only a few of the simpler algorithms that search engines use to provide relevant pages. A large portion of ranking algorithms are not publicly accessible, as disclosure could compromise the integrity of search engine results: a number of sites attempt to reverse engineer these algorithms in hopes of achieving higher positions in search engine listings.
Source: https://moz.com/blog/search-engine-algorithm-basics