Bayesian Models in Collaborative Filtering
In its most basic form, Bayes Rule describes the probability of an event based on prior beliefs or knowledge related to the event. This is intricately linked to the concept of conditional probability and the science of machine learning – applying the core ideas of probability to large datasets. Bayesian methods combine data with existing information relating to that data and then utilize algorithms and models to calculate results. Bayes Rule has significant application in many tech giants, such as Netflix, Amazon, Facebook, and Google, which collect large amounts of data and rely on predictive analytics to retain their customers. These companies rely on recommendation filters, built under the assumption that people who prefer a product will probably prefer another.
An example is Netflix – the world’s leading Internet television network with “over 100 million members in over 190 countries, enjoying hundreds of millions of hours of content per day”. It is no surprise that the company researches extensively on machine learning to continually improve personalization in user experience and optimize the Netflix service. One major aspect of the Netflix user experience is its recommendation system, which seeks to show the user a set of previously unseen content that the algorithm will think he/she will like. With a large catalog of media content and an even larger user base, a good recommendation system would enable users to interact more effectively with Netflix.
Collaborative filtering is one of the approaches used by Netflix’s recommendation system. This is by learning and predicting what a user prefers by exploiting similar patterns and characteristic across what other users watch. By Bayes Rule, we want to calculate the conditional probability of a user watching movie given that they have already watched another movie , i.e. . First, we calculate the probability that a Netflix user watches a certain movie. Since the number of users on Netflix is large, we can approximate this probability by dividing the number of users who watched that movie by the number of users on Netflix, by the Law of Large Numbers.
Then,
While this model is relatively simple, this example demonstrates that the concept of Bayes Rule is heavily utilized in the real world. Incorporating this concept in machine learning algorithms certainly creates many challenges and opportunities in improving user experience.
https://research.netflix.com/research-area/machine-learning
https://towardsdatascience.com/what-is-bayesian-statistics-used-for-37b91c2c257c