The Inner Workings of the Music Genome Project
The Music Genome Project may sound unfamiliar at first, but most internet users will recognize Pandora, the internet radio website that utilizes this technology. Pandora is popular not only because it is mostly commercial-free and can be found anywhere with internet access, but because its radio stations can be customized by the user. Initially, one song is chosen, and a radio station is created based on this song. This radio station then plays songs considered “similar” to the original choice, and the user can decide if they enjoy the songs by giving a “thumbs up” or “thumbs down”, which gives Pandora feedback on how to make the station better for the user.
To the user, this may seem like magic, but graph theory is hard at work behind the scenes. The Music Genome Project began hiring “musicologists” in 2000 to listen to songs and rate their different attributes, a process that can take 20-30 minutes per song. These attributes range from instrumentation and musical style to chord and rhythmic patterns. Different genres are rated on different numbers of attributes, with classical and jazz having the most, because two artists' performances of the same piece can differ so widely.
The Music Genome Project connects to graph theory through the ties between songs. Once the attributes of a song have been rated, they can be compared with those of other songs to find how similar the songs are. Songs can then be placed into a network, with each song as a node and an edge between each pair of similar songs, weighted by how strong the tie between them is. When a listener chooses a song for a radio station, the Music Genome Project chooses the songs that have the strongest edges to the original song. When a user gives a song a “thumbs up”, this strengthens its tie to the original song. When the user instead gives a “thumbs down”, not only is that song removed from the network for that user (the song will not be played on that radio station again), but the radio station also becomes less likely to play songs that have strong ties to the “thumbed down” song.
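The network described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pandora's actual method: the song names, the four ratings per song, the distance-based `tie_strength` formula, and the `boost` and `penalty` values are all made up for the example.

```python
import math

# Hypothetical attribute ratings; the real attributes and scales are not public here.
SONGS = {
    "Song A": [4.0, 3.5, 1.0, 2.0],
    "Song B": [4.0, 3.0, 1.5, 2.0],
    "Song C": [1.0, 0.5, 4.5, 4.0],
    "Song D": [3.5, 3.0, 1.0, 2.5],
}

def tie_strength(a, b):
    """Assumed formula: map Euclidean distance between ratings to a weight in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))

def build_graph(songs):
    """Return {song: {other_song: weight}} with one weighted edge per pair of songs."""
    graph = {name: {} for name in songs}
    names = list(songs)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            w = tie_strength(songs[u], songs[v])
            graph[u][v] = w
            graph[v][u] = w
    return graph

def new_station(graph, seed):
    """A station keeps its own copy of the seed's edge weights, so feedback stays per user."""
    return {"seed": seed, "weights": dict(graph[seed]), "banned": set()}

def ranked(station):
    """Songs the station may still play, strongest tie to the seed first."""
    items = [(w, s) for s, w in station["weights"].items()
             if s not in station["banned"]]
    return [s for w, s in sorted(items, reverse=True)]

def thumbs_up(station, song, boost=1.2):
    """Strengthen the tie between the seed and a liked song."""
    station["weights"][song] *= boost

def thumbs_down(station, graph, song, penalty=0.5):
    """Ban the song and weaken the station's ties to songs strongly tied to it."""
    station["banned"].add(song)
    for other, w in graph[song].items():
        if other in station["weights"]:
            station["weights"][other] *= 1.0 - penalty * w

if __name__ == "__main__":
    graph = build_graph(SONGS)
    station = new_station(graph, "Song A")
    print(ranked(station))
    thumbs_down(station, graph, "Song B")
    print(ranked(station))
```

Keeping a separate copy of the weights for each station is one simple way to model the idea that a “thumbs down” changes the network only for that user.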
It is interesting to note that the Music Genome Project works in a way similar to the Strong Triadic Closure Property. This property states that if Node 1 has strong ties to Nodes 2 and 3, then Nodes 2 and 3 will have at least a weak tie between them. For the Music Genome Project, a user may like Song 1 and Song 2, which are similar (indicating a strong tie between them). If there is a Song 3 that has a strong tie to Song 2, then it will probably have at least a weak tie to Song 1. The Music Genome Project uses this idea to find more songs that the listener might enjoy. Just as the two strong ties imply at least a weak tie that closes the triangle, it is probable that a listener who likes two songs will also like a third song that has at least a weak connection to them.
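The triangle argument can also be written as a small check. The sketch below is a hypothetical helper, not something taken from Pandora: given a list of strong ties, `closure_candidates` returns every pair of songs that share a strong neighbor but have no tie recorded between them. Under the Strong Triadic Closure Property, those are the pairs where at least a weak tie would be expected, which makes them natural candidates for recommendation.

```python
from itertools import combinations

def closure_candidates(strong, weak=()):
    """Return sorted pairs that share a strong neighbor but have no tie at all."""
    # Build the strong-tie neighborhood of every song.
    neighbors = {}
    for a, b in strong:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Every pair that already has some tie, strong or weak (order-insensitive).
    linked = {frozenset(e) for e in strong} | {frozenset(e) for e in weak}

    missing = set()
    for center, nbrs in neighbors.items():
        # Any two strong neighbors of the same song should be tied to each other.
        for a, c in combinations(sorted(nbrs), 2):
            if frozenset((a, c)) not in linked:
                missing.add((a, c))
    return sorted(missing)

if __name__ == "__main__":
    # The example from the text: Song 2 has strong ties to Song 1 and Song 3.
    strong_ties = [("Song 1", "Song 2"), ("Song 2", "Song 3")]
    print(closure_candidates(strong_ties))
```

With the example from the text, the only pair reported is Song 1 and Song 3; once a weak tie between them is recorded, nothing is reported.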
By these methods, a radio station can be started with “Don’t Stop Believin'” by Journey (a party classic), and end up playing “House of the Rising Sun” by The Animals, showing how completely different songs can be connected by a path of similar musical traits.
http://computer.howstuffworks.com/internet/basics/pandora.htm
http://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/
-Baby Duck