Movie Recommendations and Human Decisions
On October 2, 2006, Netflix introduced a competition called “The Netflix Prize.” The purpose of the competition was to improve Netflix’s movie recommendation algorithm. Along the way, it revealed several insights about the way humans make decisions.
If you do not already know, Netflix is a burgeoning company that started out in the business of online DVD rentals. One could order DVDs online, receive them in the mail, send them back, and repeat the process. As you can imagine, with the number of movies Netflix possessed, users often came to rely upon the recommendations presented, and Netflix found those recommendations to be insufficiently accurate. Having hit a brick wall in improving them, Netflix decided to crowdsource a solution. They offered a prize of $1,000,000 to any team that could improve the accuracy of the recommendations by roughly 10%, as measured by how closely the predicted ratings matched the ratings users actually gave to movies. The contestants also had a huge set of data on past ratings against which to check how well their algorithms worked. And thus began a competition that ran for nearly three years.
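To make the 10% target concrete, here is a minimal sketch of how such an improvement could be computed. It assumes the error measure is root-mean-square error (RMSE) between predicted and actual ratings, which is the measure the contest used; the function names and the tiny rating lists are made up for illustration and are not Netflix data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))

def percent_improvement(baseline_rmse, new_rmse):
    """How much lower the new error is, as a percentage of the baseline error."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical ratings on a 1-5 scale (illustrative only).
actual = [4, 3, 5, 1]
baseline_predictions = [3, 3, 4, 2]
new_predictions = [3.5, 3, 4.5, 1.5]

baseline_error = rmse(baseline_predictions, actual)
new_error = rmse(new_predictions, actual)
print(f"{percent_improvement(baseline_error, new_error):.1f}% improvement")
```

In this toy case every miss is cut in half, so the error drops by half. The contest asked for a much smaller relative gain, but on a far larger and noisier set of ratings.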
As you can imagine, this competition encompassed a wide range of areas, from statistics to psychology. It is hard to imagine a solution that did not take into account some of the key concepts of networks as well. For example, one of the most basic concepts, strong triadic closure, can be used to examine recommendations in a social manner. If user 1 likes a series of movies, and user 2 also likes several of those movies, then a movie that user 2 has seen and liked, but user 1 has not yet seen, is likely to be a good recommendation for user 1. Other, more statistical techniques were used as well. One that relied on user preferences was Singular Value Decomposition. Essentially, each movie is described by values along a series of descriptors, or categories, which the algorithm infers from the ratings themselves rather than taking from hand-written labels. From each user’s overall ratings of movies, the user’s preference for each specific value in a category can be discovered. To give an intuitive example, for a specific film a category may be “genre” and the value may be “action.” If I rate that movie low, it says something about what I think of action movies. Then, based on all my movie ratings and the preference I indicate for each value, a movie with the values I prefer is suggested. Techniques such as these proved to be breakthroughs, often moving a team up several percentage points.
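The idea of learning hidden “categories” from ratings can be sketched in a few lines. This is a simplified stand-in, not any team’s actual method: instead of a full Singular Value Decomposition, it fits a small latent-factor model by stochastic gradient descent, a common approximation when most ratings are missing. The function names, the parameters, and the four-user, four-movie data set are all assumptions made up for illustration.

```python
import math
import random

def train_latent_factors(ratings, n_users, n_movies, n_factors=2,
                         epochs=400, lr=0.03, reg=0.02, seed=0):
    """Fit rating ~ global mean + dot(user vector, movie vector)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    mean = sum(r for _, _, r in ratings) / len(ratings)
    # Small random starting values for every user and movie vector.
    users = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
             for _ in range(n_users)]
    movies = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
              for _ in range(n_movies)]
    for _ in range(epochs):
        for u, m, rating in ratings:
            guess = mean + sum(a * b for a, b in zip(users[u], movies[m]))
            err = rating - guess
            for f in range(n_factors):
                uf, mf = users[u][f], movies[m][f]
                # Nudge both vectors to shrink the error; reg keeps them small.
                users[u][f] += lr * (err * mf - reg * uf)
                movies[m][f] += lr * (err * uf - reg * mf)
    return mean, users, movies

def predict(model, u, m):
    """Predicted rating of movie m by user u."""
    mean, users, movies = model
    return mean + sum(a * b for a, b in zip(users[u], movies[m]))

def training_rmse(model, ratings):
    """Root-mean-square error of the model on the given ratings."""
    total = sum((r - predict(model, u, m)) ** 2 for u, m, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical (user, movie, rating) triples: users 0 and 1 favor movies 0 and 1,
# users 2 and 3 favor movies 2 and 3. User 0 has not rated movie 3.
ratings = [(0, 0, 5), (0, 1, 5), (0, 2, 1), (1, 0, 5), (1, 1, 4), (1, 3, 1),
           (2, 0, 1), (2, 2, 5), (2, 3, 5), (3, 1, 1), (3, 2, 4), (3, 3, 5)]

model = train_latent_factors(ratings, n_users=4, n_movies=4)
print("predicted rating, user 0 on unseen movie 3:", round(predict(model, 0, 3), 2))
```

Nobody tells the model which movies are “action” films. Whatever the learned factors mean, they come only from the pattern of who rated what highly, which is why the “genre” example above is best read as intuition rather than as a literal label.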
However, it is also important to note how progress unfolded over the competition. While several percentage points were gained towards the 10% threshold within a matter of months, it took a whole year to get from 8.5% to 10%. Why? It appeared that there was a set of films that ran contrary to the algorithms, films such as “Napoleon Dynamite,” “I Heart Huckabees,” “Lost in Translation” and “Sideways.” All are cult classics and, more importantly, polarizing love-hate films. Personally, I remember having long arguments about how unfunny Napoleon Dynamite was with a close friend who regarded it as a masterpiece, even though the two of us share many interests.
I am not surprised that an algorithm has trouble with cases such as Napoleon Dynamite. Looking back, it seemed almost as if Napoleon Dynamite was a societal phenomenon: a trend, if you will. My biased rationalisation of how it came to be significant borrows some ideas from Malcolm Gladwell’s book “The Tipping Point” and the paper “The Strength of Weak Ties.” Somewhere out there was a “maven” who genuinely enjoyed the movie. Through his or her force of personality and ability to create “weak ties” with numerous people who may not have been as involved in the film world, the movie became a topic of conversation, and the idea of it being a phenomenon became a trend of its own, exposing it to many more people than is usual for films of this nature. It seems obvious that a machine would have difficulty understanding this, especially with such a limited data set, which omitted any personal information at all.
I think it is fair to say that we have a long way to go in our modeling of human choices and interactions. In my unsubstantiated view, in addition to societal patterns, it’s possible that the psychological makeup of each person is a crucial factor in something such as a recommendation engine. A question such as “Is this person stubborn, liking only what he or she ‘knows’ to be good, or is this person more susceptible to societal trends and peer pressure?” is important, and there are many more questions to be answered. Unfortunately, it seems that Netflix will not be involved in the next great algorithmic advance: due to privacy concerns, they have had to cancel a second planned competition.
Until Next Time,
Shawn
Source: http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html?pagewanted=all