How Recommender Systems Work
Recommender systems usually work in several steps. The first step is called “candidate generation.” In this step, the system narrows thousands or millions of pieces of content down to a much smaller set of “candidates” for the user. Then, if the candidates come from different candidate generation systems, an additional scoring step normalizes their scores so that they can be compared. Finally, assuming that users tend to like content with higher scores, the system reranks the candidates and shows the user the content with the highest scores.
There are two main kinds of candidate generation systems: content-based filtering systems and collaborative filtering systems.
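The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the generator functions, the post IDs, the raw scores, and the choice of min-max normalization are all made up for the example, and real systems use far more elaborate scoring.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a generator returns (item_id, raw_score) pairs on its own scale.
Candidate = Tuple[str, float]
Generator = Callable[[str], List[Candidate]]


def normalize(candidates: List[Candidate]) -> Dict[str, float]:
    """Min-max scale one generator's raw scores into [0, 1]."""
    if not candidates:
        return {}
    scores = [score for _, score in candidates]
    low, high = min(scores), max(scores)
    span = high - low
    # If every score is identical, treat all candidates as equally good.
    return {item: ((score - low) / span if span else 1.0) for item, score in candidates}


def recommend(user_id: str, generators: List[Generator], k: int) -> List[str]:
    """Generate candidates, normalize their scores, then rerank and keep the top k."""
    merged: Dict[str, float] = {}
    for generate in generators:
        for item, score in normalize(generate(user_id)).items():
            # Keep the best normalized score when several generators propose an item.
            merged[item] = max(score, merged.get(item, 0.0))
    # Highest score first; ties are broken alphabetically by item id.
    ranked = sorted(merged.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


# Two toy generators whose scores live on different scales (made-up data).
def from_followed_accounts(user_id: str) -> List[Candidate]:
    return [("post_a", 90.0), ("post_b", 40.0), ("post_c", 10.0)]


def from_similar_users(user_id: str) -> List[Candidate]:
    return [("post_b", 0.9), ("post_d", 0.7), ("post_e", 0.1)]


top = recommend("user_1", [from_followed_accounts, from_similar_users], k=3)
print(top)
```

Without the normalization step, every candidate from the first generator would outrank every candidate from the second simply because its scores are on a 0 to 100 scale.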
Content-based filtering systems assign predefined labels to users and posts and use them to predict how likely a user is to like a piece of content. For example, if the user is labeled as “like basketball” and “dislike baseball,” the recommender system will tend to give content labeled “basketball” higher scores than content labeled “baseball.” Of course, in real life there is no pure like or dislike. Preferences are instead stored as scores within a certain range that reflect the user’s interests. If the user reacts more positively to some features, the scores for those features go up. Instagram uses the social graph to get a better understanding of what the user is interested in. For example, the system recommends posts similar to ones the user liked or posted before. This kind of filtering does not require other users’ data when generating recommendations for a specific user. One potential disadvantage is that the system will not surface content outside the user’s existing interests, even content the user might like, because all the recommendation information comes from the user’s own activity history. It’s not hard to imagine this kind of system leading to information cocoons, where people see only what they already want to see and not the whole picture.
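As a minimal sketch of the idea, assume preferences are scores between -1 and 1 per label and each item carries label weights between 0 and 1. The profile, the items, the scoring rule (a weighted sum), and the update rate below are all hypothetical choices for illustration, not a description of how any real platform does it.

```python
from typing import Dict

# Hypothetical user profile: preference score per label, in [-1, 1].
user_profile: Dict[str, float] = {"basketball": 0.8, "baseball": -0.6}

# Hypothetical items: each has label weights in [0, 1].
items: Dict[str, Dict[str, float]] = {
    "nba_highlights": {"basketball": 1.0},
    "mlb_recap": {"baseball": 1.0},
    "sports_news": {"basketball": 0.5, "baseball": 0.5},
}


def score(profile: Dict[str, float], labels: Dict[str, float]) -> float:
    """Weighted sum of the user's preference for each label on the item."""
    return sum(profile.get(label, 0.0) * weight for label, weight in labels.items())


def update_profile(
    profile: Dict[str, float],
    labels: Dict[str, float],
    reaction: float,
    rate: float = 0.1,
) -> Dict[str, float]:
    """Return a new profile nudged toward (reaction > 0) or away from (reaction < 0) the item's labels."""
    updated = dict(profile)
    for label, weight in labels.items():
        value = updated.get(label, 0.0) + rate * reaction * weight
        updated[label] = max(-1.0, min(1.0, value))  # keep scores inside [-1, 1]
    return updated


# Rank items using only this user's own profile; no other users are involved.
ranked = sorted(items, key=lambda name: score(user_profile, items[name]), reverse=True)
print(ranked)

# A positive reaction to a baseball post raises the baseball score slightly.
after_like = update_profile(user_profile, items["mlb_recap"], reaction=1.0)
```

Note that an item with labels the profile has never seen always scores 0, which is one way to see why this approach struggles to surface anything outside the user's history.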
Collaborative filtering systems, on the other hand, do not have predefined labels for users or content. They rely on patterns in user behavior instead. For instance, if A is similar to B, things that B likes will be recommended to A and vice versa. This is an example of user-user filtering. Another approach is item-item filtering, where the system recommends items similar to things you have bought or seen before, according to your activity history. User-user filtering can be described as “people who like this also like this,” and item-item filtering as “because you like this, you may also like this.” The system adjusts its picture of users’ preferences based on feedback such as content ratings. TikTok seems to use collaborative filtering. When users first start using TikTok, they simply skim through random videos. The recommender system gradually maps out their interests according to how long they watch certain videos. Although this kind of system can recommend new content to users, it is still hard to avoid bias. After all, the system will not recommend totally unrelated content.
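Here is a rough sketch of user-user filtering. It assumes explicit ratings and uses cosine similarity to decide how alike two users are. The ratings, the user names A, B, and C, and the similarity measure are all assumptions made for the example; a system driven by watch time would plug in a different signal.

```python
import math
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]

# Made-up ratings: user -> {item: rating}. Note there are no labels on users or items.
ratings: Ratings = {
    "A": {"video_1": 5.0, "video_2": 4.0},
    "B": {"video_1": 5.0, "video_2": 4.0, "video_3": 5.0},
    "C": {"video_1": 1.0, "video_4": 5.0},
}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two users' rating vectors (missing ratings count as 0)."""
    shared = set(u) & set(v)
    dot = sum(u[item] * v[item] for item in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_user_user(target: str, all_ratings: Ratings, k: int) -> List[str]:
    """Score unseen items by the ratings of other users, weighted by similarity to the target."""
    seen = all_ratings[target]
    scores: Dict[str, float] = {}
    for other, other_ratings in all_ratings.items():
        if other == target:
            continue
        similarity = cosine(seen, other_ratings)
        for item, rating in other_ratings.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]


print(recommend_user_user("A", ratings, k=2))
```

A's ratings look much more like B's than C's, so B's favorite carries more weight than C's. Item-item filtering can be sketched the same way by comparing items by the users who rated them instead of comparing users by the items they rated.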
I have to admit that recommender systems work really well these days. The recommendations from Amazon, YouTube, etc., are really accurate. It’s easy to spot bias as well, though. My YouTube recommendations are mostly video games. What’s worse, they are mostly games that I have played before, with almost no new games. These systems should probably build the concept of weak ties into their recommendations. Just as people connected by weak ties can still know each other, people who are not completely similar can still share interests.
Sources:
https://towardsdatascience.com/based-on-your-activity-you-should-like-this-instagram-vs-tiktok-5fcfa1c07e46
https://towardsdatascience.com/introduction-to-recommender-systems-1-971bd274f421