The Books That We Love
http://www.theatlantic.com/technology/archive/2014/09/the-100-books-that-facebook-users-love/379797/
The article linked above covers a Facebook meme that recently went viral. Users posted the ten books that had “stuck” with them in some way, then often tagged friends to post lists of their own. Facebook’s Data Science group collected these posts, analyzed them, and published its findings, which the article summarizes.
In addition to listing the books that appeared most often in users’ top tens, the data scientists went a step further and graphed the information as a network. They looked at “connections between the books, e.g. ‘people who listed X also listed Y’, using pointwise mutual information”. In their diagram, books are nodes (sized by how often they were mentioned), and an edge between two nodes represents “an unusual number of co-occurrences of the two books in the lists.” The article’s author notes that some genres appear to form clumps, citing high fantasy, European classics, and American classics as examples.
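To make the idea of pointwise mutual information concrete, here is a minimal sketch of how such a network could be built from a handful of book lists. The article does not describe the data scientists’ exact procedure, so the function name `pmi_edges`, the `threshold` parameter, and the toy lists are all assumptions made for illustration. PMI compares how often two books actually appear together with how often they would appear together if people chose books independently; a positive score means the pair co-occurs more than expected.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(lists, threshold=0.0):
    """Return (book_counts, edges) for a collection of book lists.

    Hypothetical helper, not the Facebook team's code.
    edges maps an alphabetically ordered pair (a, b) to its PMI score,
    keeping only pairs whose score exceeds `threshold`.
    """
    n = len(lists)
    book_counts = Counter()  # number of lists each book appears in (node size)
    pair_counts = Counter()  # number of lists each pair appears in together
    for books in lists:
        unique = sorted(set(books))  # ignore duplicates within one list
        book_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    edges = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n
        p_a = book_counts[a] / n
        p_b = book_counts[b] / n
        pmi = math.log2(p_ab / (p_a * p_b))
        if pmi > threshold:
            edges[(a, b)] = pmi
    return book_counts, edges


# Toy data, invented for this example.
toy_lists = [
    ["The Hobbit", "The Lord of the Rings", "Dune"],
    ["The Hobbit", "The Lord of the Rings"],
    ["Pride and Prejudice", "Jane Eyre"],
    ["Pride and Prejudice", "Jane Eyre", "The Hobbit"],
]

node_sizes, edges = pmi_edges(toy_lists)
for pair, score in sorted(edges.items()):
    print(pair, round(score, 3))
```

In this toy data, “Jane Eyre” and “The Hobbit” do appear together once, but less often than independence would predict, so no edge is drawn between them. That is the sense in which an edge marks an unusual number of co-occurrences rather than any co-occurrence at all.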
Overall, the data seem fairly thorough. Demographic information on the people sampled is available: women outnumbered men 3:1, 67% were United States citizens, and the average age was 37. The sample was a decent size, over 100,000 people. Some interpretation of the data is offered, such as a comment that people who enjoyed classic romance novels tended to like one particular romantic comedy. The author also compares the Facebook results with Amazon’s Books of the Decade and suggests reasons for the differences between the lists, including differences in age and the fact that a book may be widely purchased without “sticking” with the reader.
The article relates to class by displaying a network graph and describing it with the same terminology (nodes, edges). It connects to Dr. Kleinberg’s early lectures on visualizing data as a network in order to reveal connections that might not otherwise be apparent, and on how a network can form small groups that show how its nodes are related. Here, a main group seems to emerge, with several bridges extending from it to smaller groups.
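As a rough illustration of the bridge idea, the sketch below finds bridges in a small made-up graph, taking a bridge to be an edge whose removal leaves its two endpoints disconnected. The graph, the node labels, and the helper names `reachable` and `find_bridges` are assumptions for this example and are not drawn from the Facebook diagram. It uses the simplest possible approach (remove each edge in turn and check whether its endpoints can still reach each other), which is fine for small graphs but slow for large ones.

```python
from collections import deque


def reachable(adj, start, skip_edge=None):
    """Return the set of nodes reachable from `start`, ignoring `skip_edge`."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if skip_edge is not None and {u, v} == set(skip_edge):
                continue  # pretend this edge has been removed
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def find_bridges(edge_list):
    """Return the edges whose removal disconnects their two endpoints."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    bridges = []
    for u, v in edge_list:
        if v not in reachable(adj, u, skip_edge=(u, v)):
            bridges.append((u, v))
    return bridges


# Two tightly knit triangles (A-B-C and D-E-F) joined by a single edge C-D.
toy_edges = [
    ("A", "B"), ("B", "C"), ("C", "A"),
    ("C", "D"),
    ("D", "E"), ("E", "F"), ("F", "D"),
]

bridges = find_bridges(toy_edges)
print(bridges)
```

Only the edge between C and D qualifies, since every other edge sits on a triangle and has an alternate route around it. The links between clusters in the book network may be bridges only in a looser sense, if more than one edge connects a small group to the main one.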