


Bayes’ Theorem, Network Effects, and reddit

Link to Schneider’s Post

This post assumes the reader is familiar with reddit. For those who aren’t, an explanation is here and an FAQ is here.

Todd Schneider’s post begins with a simple question: What are the chances of a reddit post reaching the front page, given that it is on the second, third, or fourth page?

This is a natural application of conditional probability and Bayes’ Theorem. However, since there is no known, fixed probability of a post reaching any given page, the author collected extensive data and estimated the probabilities from it. To determine a post’s rank, reddit uses a fairly complex algorithm that heavily weights the net votes (that is, upvotes minus downvotes) and the age of the post. However, the author’s analysis of the collected data shows that another factor is involved: the “type” of the subreddit. The most popular subreddits are the first type; they make up the majority of the front page and then spike again at page 3. The second type consists of less popular topics, which make up the majority of the second page. The third type contains the rest of the subreddits, which make up the bottom of the first page and some of the second and third pages.
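To make the conditional-probability framing concrete, here is a minimal sketch of Bayes’ Theorem applied to the question of whether a post seen on page 2 reaches the front page. The function name and all three input probabilities are invented for illustration; they are not figures from Schneider’s data.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if not 0 < evidence <= 1:
        raise ValueError("evidence probability must be in (0, 1]")
    return likelihood * prior / evidence


# Hypothetical numbers, made up for this example only.
p_front = 0.02              # P(F): a post ever reaches the front page
p_page2_given_front = 0.6   # P(S | F): a front-page post was earlier seen on page 2
p_page2 = 0.05              # P(S): a post is ever seen on page 2

# P(F | S): chance of reaching the front page, given the post is on page 2
p_front_given_page2 = bayes_posterior(p_front, p_page2_given_front, p_page2)
print(round(p_front_given_page2, 2))
```

In practice none of these three quantities is known in advance, which is why the author had to estimate them from observed data.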

The article combines an analysis of information cascades with conditional probability. Voting on reddit could be an example of an information cascade: people tend to gather on sites like these, which creates a herd mentality. People vote depending on a post’s current upvotes, which can start a cascade upwards (or downwards). The interesting point the author makes is that even if a post is in a cascade, its chance of reaching the front page still depends on the type of subreddit it comes from, so it does not have the same chance as a post from another subreddit. This complicates the Bayesian calculation of the probability that a given post reaches the front page.
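One way to handle the dependence on subreddit type is to estimate the conditional probability separately for each type. The sketch below assumes a hypothetical list of observations for posts seen on a given page, each recorded as (subreddit type, whether the post later reached the front page). The labels and sample records are invented for illustration and do not come from the original data.

```python
from collections import defaultdict


def front_page_rate_by_type(observations):
    """Estimate P(front page | seen on this page, subreddit type) from counts.

    observations: iterable of (subreddit_type, reached_front_page) pairs.
    """
    seen = defaultdict(int)
    reached = defaultdict(int)
    for sub_type, reached_front in observations:
        seen[sub_type] += 1
        if reached_front:
            reached[sub_type] += 1
    # Empirical conditional probability for each subreddit type
    return {sub_type: reached[sub_type] / seen[sub_type] for sub_type in seen}


# Made-up sample records, not real reddit data.
sample = [
    ("type1", True), ("type1", True), ("type1", False),
    ("type2", True), ("type2", False), ("type2", False), ("type2", False),
    ("type3", False), ("type3", False),
]
rates = front_page_rate_by_type(sample)
for sub_type, rate in sorted(rates.items()):
    print(sub_type, round(rate, 2))
```

Splitting the estimate this way means that two posts on the same page can have very different estimated chances, which is the complication the author points out.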
