


The Unbalanced Popularity of Hashtags

On the popular microblogging site Twitter, hashtags (indicated by a ‘#’ before a word) can be used to easily group together similar 140-character messages. Hashtags generally follow trending news events and popular conversation topics. The popularity of different hashtags can be explored using basic principles of popularity across networks driven by information cascades.

The choice of a hashtag for a particular message is not an independent decision; if it were, the hashtags used for a given topic would vary widely. Rather, decisions are driven by copying, which lets a new tweet join a group discussion with everyone else using the same hashtag. This is why grouping messages by topic and tracking trends is possible on Twitter. As an example, say I wanted to tweet a message about Barack Obama’s presidential campaign. I could independently choose some relevant hashtag (say #ObamaCampaign), and there might be some others using the same one. But more likely, I would either look at trending hashtags related to Obama’s campaign, or look over my friends’ tweets, and find that the vast majority have used #Obama2012. I would then include #Obama2012 in my tweet, joining the discussion of Obama by copying an already popular hashtag.

The result of this hashtag grouping is that a relatively small number of hashtags are in widespread use at any given time. Because tweet authors generally don’t pick a hashtag completely randomly or independently, a few key hashtags in a given period tend to get more and more popular. New tweets preferentially attach to hashtags that are already popular. Hence, before any Obama-related hashtag became popular, it would have been hard to predict which one would win out, and tweets would probably have been spread fairly evenly over a number of different hashtags. But as the topic grew in popularity, one (#Obama2012) gained noticeably more use, causing future tweeters to follow the trend.
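The copying dynamic described above can be illustrated with a small simulation. This is only a sketch under assumptions of my own, not a model fitted to Twitter data: each new tweet coins a brand-new hashtag with a made-up probability `p_new`, and otherwise copies the hashtag of a randomly chosen earlier tweet. The function name, the parameter names, and the value 0.1 are all hypothetical.

```python
import random
from collections import Counter

def simulate_hashtag_copying(num_tweets, p_new=0.1, seed=0):
    """Return a Counter mapping hashtag id -> number of tweets using it.

    Assumed toy model: with probability p_new a tweet invents a new
    hashtag; otherwise it copies the hashtag of a uniformly random
    earlier tweet, so popular hashtags are copied more often.
    """
    rng = random.Random(seed)
    tweets = [0]      # hashtag used by each tweet so far; the first tweet uses tag 0
    next_tag = 1      # id to assign to the next newly coined hashtag
    for _ in range(num_tweets - 1):
        if rng.random() < p_new:
            tag = next_tag              # independent choice: coin a new hashtag
            next_tag += 1
        else:
            tag = rng.choice(tweets)    # copy the hashtag of a random earlier tweet
        tweets.append(tag)
    return Counter(tweets)

counts = simulate_hashtag_copying(10_000)
# Show the five most used hashtag ids and their tweet counts.
print(counts.most_common(5))
```

Picking a random earlier tweet, rather than a random hashtag, is what makes a hashtag's chance of being copied proportional to its current popularity.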

Although I don’t have the numbers for every tweet and hashtag used in the last month, if I did, I could fit a power law to the popularity of hashtags. With k being the number of tweets using a certain hashtag, f(k) (the fraction of hashtags used in exactly k tweets) could be approximated by the power law commonly found in studies of popularity: f(k) is proportional to 1/k^c. Here c is a constant that controls how rare very popular hashtags are (the larger c is, the rarer they are), and it could be estimated from the data.
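To make the estimation step concrete, here is a rough sketch of how c could be read off from data. A power law looks like a straight line with slope -c when log f(k) is plotted against log k, so a least-squares line through those points gives an estimate. The tweet counts below are made up for illustration (they are not real Twitter numbers), the function names are hypothetical, and a plain log-log fit is only a crude estimator.

```python
import math
from collections import Counter

def hashtag_fractions(tweets_per_hashtag):
    """Map k -> f(k), the fraction of hashtags used in exactly k tweets."""
    histogram = Counter(tweets_per_hashtag)
    total = len(tweets_per_hashtag)
    return {k: n / total for k, n in histogram.items()}

def estimate_exponent(fractions):
    """Least-squares slope of log f(k) against log k, negated to give c."""
    xs = [math.log(k) for k in fractions]
    ys = [math.log(f) for f in fractions.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Made-up data: the number of hashtags with k tweets is 64 / k^2,
# i.e. 64 hashtags with 1 tweet, 16 with 2, 4 with 4, and 1 with 8.
tweets_per_hashtag = []
for k in (1, 2, 4, 8):
    tweets_per_hashtag += [k] * (64 // k**2)

fractions = hashtag_fractions(tweets_per_hashtag)
c_estimate = estimate_exponent(fractions)
print(c_estimate)
```

Because the made-up counts follow 1/k^2 exactly, the fitted value should come out at 2 up to floating-point error; real data would be noisier, especially for the few very popular hashtags.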

Twitter is an environment where the popularity of a hashtag can grow (and fall) very quickly. Unlike some other forms of popularity, most hashtags rise to a brief peak and then drop off (Lehmann et al.). Depending on the type of topic, some hashtags begin to grow in popularity before the event, some unexpected events produce a sudden spike followed by a drop-off, and some rise and fall roughly symmetrically around the peak. For all types, tweets tend to group around the most popular hashtags, signaling the collective attention of a crowd to an event, idea, or topic; Lehmann et al. group these temporal patterns into dynamic classes.

Sources:

http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch18.pdf

http://arxiv.org/pdf/1111.1896.pdf

 

ebonadonna
