Behind Spotify’s “Discover Weekly” Playlist
Every Monday, Spotify releases a “Discover Weekly” playlist: a mix of 30 songs personalized for each user based on their music taste. Many have praised the accuracy of this playlist, but how is it put together with such precision? This article first delves into the history of song recommendation systems, starting with “manual curation” in the 2000s, where experts would hand-pick a set of similar-sounding songs, and users who liked that style would simply listen to these playlists. Pandora and The Echo Nest then took this a step further by analyzing and categorizing each track’s content so that tracks could later be filtered and recommended to users. Finally, Last.fm implemented a process called collaborative filtering, which is a major part of the recommendation system that Spotify and many other organizations use today. As described in the article, Spotify uses three recommendation models to put together Discover Weekly: Collaborative Filtering models, Natural Language Processing (NLP) models, and Audio models:
Collaborative Filtering: takes in both explicit and implicit feedback to identify users with similar taste and recommend content based on that data. Spotify relies mostly on implicit feedback, including the stream count of a track, whether a user adds a track to a playlist, and whether a user visits an artist’s page after listening to one of their songs. Spotify interprets each of these actions as a signal that a user likes or does not like a song. Collaborative filtering uses this data to determine which users are similar, then suggests songs that one user has not heard yet but that similar users have listened to and liked (a minimal sketch appears after this list).
NLP: These models track metadata, news articles, blogs, and other text around the internet. At a high level, Spotify uses web crawlers (which we discussed in class) to figure out what people are saying about certain songs and artists: what adjectives and language are frequently used to describe those songs, and which other artists and songs are discussed alongside them. Much like collaborative filtering, the NLP model then creates a vector representation of songs and artists from this data to determine similarities between them (see the second sketch below).
Audio: In this model, convolutional neural networks deduce the characteristics of a song, such as its tempo, acousticness, and danceability, and this data is then used to compare songs and suggest similar ones. The benefit of adding this third model is that, unlike the previous two, it has no bias toward songs that have been out longer and have therefore been talked about and streamed more: a brand-new track with no listeners or press coverage can still be analyzed and recommended (see the third sketch below).
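To make the collaborative filtering idea concrete, here is a minimal sketch in Python of user-based filtering over a play-count matrix. The users, songs, and counts are all invented for illustration; Spotify’s actual system operates at vastly larger scale, with learned latent factors rather than raw counts.

```python
import numpy as np

# Hypothetical implicit-feedback matrix: rows are users, columns are songs,
# and entries are play counts (all values here are illustrative).
plays = np.array([
    [5.0, 0.0, 3.0, 0.0],   # user 0
    [4.0, 1.0, 3.0, 0.0],   # user 1 (taste similar to user 0)
    [0.0, 4.0, 0.0, 5.0],   # user 2 (very different taste)
])

def cosine_sim(u, v):
    """Cosine similarity between two users' play-count vectors."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

def recommend(user, plays, k=1):
    """Score songs a user hasn't heard by similar users' play counts."""
    sims = np.array([cosine_sim(plays[user], other) for other in plays])
    sims[user] = 0.0                     # ignore the user's own row
    scores = sims @ plays                # similarity-weighted play counts
    scores[plays[user] > 0] = -np.inf    # never re-recommend heard songs
    return np.argsort(scores)[::-1][:k]

print(recommend(0, plays))  # -> [1]: song 1, played by the similar user 1
```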
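The NLP side can be sketched the same way: crawled text about artists becomes term-weight vectors, which are then compared. The snippets below are invented; a real pipeline would crawl news articles and blogs at scale.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical snippets of web text about three artists (invented examples).
docs = [
    "dreamy hazy indie pop with mellow synths and soft vocals",   # artist A
    "mellow dreamy synth pop, hazy and atmospheric indie sound",  # artist B
    "aggressive fast thrash metal with heavy distorted riffs",    # artist C
]

# Each artist becomes a vector of weighted terms; shared descriptive
# language ("dreamy", "hazy", "mellow") pulls similar artists together.
vectors = TfidfVectorizer().fit_transform(docs)

sims = cosine_similarity(vectors)
print(sims[0].round(2))  # artist A scores far closer to B than to C
```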
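Finally, a sketch of the audio model: a tiny convolutional network that maps a song’s spectrogram to a small feature vector, which can then be compared across tracks. The architecture and sizes here are assumptions for illustration only, not Spotify’s actual network.

```python
import torch
import torch.nn as nn

# Toy convolutional network mapping a mel-spectrogram to a small vector of
# audio features; all layer sizes are illustrative assumptions.
class AudioFeatureNet(nn.Module):
    def __init__(self, n_features=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # pool over frequency and time
        )
        self.head = nn.Linear(32, n_features)

    def forward(self, spectrogram):
        # spectrogram shape: (batch, 1, mel_bins, time_frames)
        x = self.conv(spectrogram).flatten(1)
        return self.head(x)

net = AudioFeatureNet()
fake_spectrogram = torch.randn(1, 1, 128, 400)  # stand-in for a real track
features = net(fake_spectrogram)                # compare via e.g. cosine
print(features.shape)                           # torch.Size([1, 8])
```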
Spotify’s approach to its “Discover Weekly” playlist has a lot of overlap with the topics that we have discussed in class. Specifically, there are many parallels between the way Google must return relevant search results and the way Spotify must return relevant song recommendations. We talked in class about how Google uses web crawlers to index and structure all of its information. Spotify, like Google, faces the challenge of sifting through millions of possible results and providing users with those that are most relevant to what they want. As we learned, finding this relevant information raises several complications, including information abundance, where to crawl, and bias. As described above, Spotify attempts to combat the bias of its first two models with its audio model. The NLP model, which uses web crawlers to collect data about songs, has to navigate a large amount of information. Spotify also deals with information abundance, since there are many songs that would be “good” recommendations for a given user.

As we learned in class, counting in-links is a good start: the most relevant results often correlate with the number of in-links pointing to them. However, since there is no single “right” answer to what makes a good song for a user, the problem becomes more complex. The structure of Spotify’s recommendation system is comparable to the hub-authority structure that we discussed in class, with users as the hubs and songs as the authorities. Good songs for a user are listened to by many similar users; likewise, the best (or most similar) users point to “good” songs, meaning songs that similar users would like. Based on these characteristics, a score can be assigned to each hub and authority to determine how “good” it is (a toy version of this scoring loop appears at the end of this post). On a very basic level, this is how Spotify’s collaborative filtering model figures out the recommended songs for a user.

It is also worth noting the similarities between the implicit data that Spotify collects and the data that Facebook collects, which Lars Backstrom spoke about in class. Facebook has a point system based on how much a user appears to like a post: some points are allocated if someone clicks on a profile, more for a like, and even more for a comment. Spotify does much the same, judging a user’s music taste by how long they play a song, which songs they skip, which artists’ pages they click on, and so on. Of course, what is really going on behind the scenes in Spotify’s algorithm for the “Discover Weekly” playlist is much more complex, but it is nevertheless strongly related to what we learned in class about web search, hubs and authorities, and Facebook’s news feed algorithm.
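As a rough illustration of the hub-authority analogy above, here is a toy HITS-style iteration over a made-up user–song listening matrix. Users (hubs) and songs (authorities) repeatedly reinforce each other’s scores; the data is invented, and this is an analogy to the class material, not Spotify’s actual algorithm.

```python
import numpy as np

# Toy bipartite "listened to" matrix: rows are users (hubs), columns are
# songs (authorities); a 1 means the user listened to the song.
L = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])

hubs = np.ones(L.shape[0])
auths = np.ones(L.shape[1])

# HITS-style updates: a song's authority score is the sum of the hub scores
# of users who listened to it, and a user's hub score is the sum of the
# authority scores of the songs they listened to.
for _ in range(50):
    auths = L.T @ hubs
    auths /= np.linalg.norm(auths)
    hubs = L @ auths
    hubs /= np.linalg.norm(hubs)

print(np.round(auths, 3))  # songs co-listened to by strong hubs score highest
```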