


YouTube Recommendations and Information Cascades

I was browsing my “Home” page on YouTube the other day when a thought crept into my head: how do search engines like Google know which ads to show you, and how do sites like YouTube know which unwatched videos to recommend?

The short answer is obvious, of course: these sites keep track of your data and use it to decide what to recommend. But I wanted to know more. How does YouTube, for instance, decide what content to recommend to me based on this data? How does it know which videos are similar?

I began doing a brief search of the web (ironically through Google’s personalized results) and came across this interesting article about the history of YouTube’s algorithms:

How YouTube perfected the feed – The Verge

Most of the article is not about the in-depth mechanics of how YouTube’s algorithm works. It does, however, offer a fascinating look at the history of the algorithm behind YouTube’s recommendation feed. It describes, for example, how the previous algorithm, known as Sibyl, was very good at finding and recommending content that was just like what you had previously watched. The newer system, which the article refers to as Google Brain, is good at drawing inferences from data even where human insight would not reach a conclusion. YouTube applies this capability across its many videos to decide which content to recommend to a user.

Despite the lack of detail, I still found my answer in this article. The YouTube algorithm responsible for recommending new content to existing users takes into consideration what content the user has consumed, and then combines this data with YouTube’s own input to determine what content is similar, but not identical, to what the user has been watching. The algorithm then recommends these pieces of content on the expectation that a user who watched the originals will want to watch them too.
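To make the idea of “similar but not exactly the same” concrete, here is a minimal sketch in Python. It is not YouTube’s actual method, which the article does not describe in detail. The catalog, the topic tags, and the `cosine` and `recommend` functions are all made up for illustration, and it assumes each video can be summarized as a small set of tags.

```python
from math import sqrt

# Hypothetical catalog: video title -> set of topic tags (made-up data).
CATALOG = {
    "Half-Windsor knot tutorial": {"tie", "knot", "how-to"},
    "Full Windsor knot tutorial": {"tie", "knot", "how-to", "formal"},
    "Bow tie in 60 seconds": {"tie", "how-to", "formal"},
    "Sourdough for beginners": {"baking", "how-to"},
    "Top 10 chess openings": {"chess", "strategy"},
}


def cosine(a, b):
    """Cosine similarity between two tag sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(watched, catalog=CATALOG, k=3):
    """Rank unwatched videos by their best similarity to any watched video."""
    if not watched:
        return []
    scores = {}
    for title, tags in catalog.items():
        if title in watched:
            continue  # similar, but never the exact video already seen
        scores[title] = max(cosine(tags, catalog[w]) for w in watched)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return [t for t in ranked if scores[t] > 0][:k]


print(recommend(["Half-Windsor knot tutorial"]))
```

In this toy data, the other tie videos rank first because they share the most tags with the watched video, while the chess video shares none and is dropped. A real recommender would rely on far richer signals than hand-written tags.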

As I came to understand how this works, I realized that it closely resembles an information cascade as discussed in the course. The information that is cascading, in this case, is the content that a YouTube user consumes. The algorithm, essentially acting as the decision-maker within the cascade, bases each new decision on the information it has already observed.

This would also mean that the algorithm is susceptible to an incorrect information cascade. If, for example, a user watches videos about a topic that they need only at a certain moment and have no future interest in, the algorithm will nonetheless conclude that this user wants to watch content related to this topic for some time afterward. I can even offer a personal example of this: I always forget how to tie a half-Windsor knot whenever I need to wear a tie. Usually, I end up watching three or four “how-to” videos, and afterward the YouTube algorithm recommends many similar videos about tying knots for the next hour or so.
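The tie-knot example can be sketched as a toy model. The assumption here, which is mine and not something the article states, is that the recommender looks only at a short window of recent watches and follows whichever topic dominates it. The `dominant_topic` function and the `window` parameter are hypothetical.

```python
from collections import Counter, deque


def dominant_topic(history, window=5):
    """Return the most common topic among the last `window` watched videos."""
    recent = deque(history, maxlen=window)  # keeps only the newest entries
    if not recent:
        return None
    return Counter(recent).most_common(1)[0][0]


long_term = ["music"] * 20   # weeks of steady viewing
burst = ["knots"] * 4        # four how-to videos before a formal event

print(dominant_topic(long_term))                          # music
print(dominant_topic(long_term + burst))                  # knots
print(dominant_topic(long_term + burst + ["music"] * 3))  # music
```

Four knot videos are enough to outweigh twenty music videos, because only the most recent watches count. The mistaken inference also corrects itself once newer watches push the burst out of the window, which fits the way the knot recommendations fade after an hour or so.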

Most of the time, of course, this algorithm is very good at recommending content that you want to watch next. Still, it is fascinating to understand how YouTube can end up recommending tens or even hundreds of videos that you have absolutely no interest in seeing.
