Skip to main content



Zipf’s Law, The Internet, and Cascades

http://www.hpl.hp.com/research/idl/papers/ranking/adamicglottometrics.pdf

 

Zipf’s law is a kind of distribution that a large percentage of data follow. Properties following Zipf’s law have a frequency that is a power law of their rank. In fact, any p.d.f. (probability density function) defined purely by a power-law may be converted into a Zipf’s law style equation instead by identifying the ranks with particular quantiles.

Zipf’s law like distributions arise naturally in situations where growth is proportional to existing rank, in situations such as the internet, where a site having very many pages is likely to add more pages than a small site only having a single page. This alone is enough to create a power distribution, but also creates an age dependence. The longer a page has existed the more it would have grown and thus the larger it will be. It turns out that models where different pages have different growth rates based for example on interest level, also lead to power law distributions, while allowing an older page to still grow to a larger size than a previous old page.

The internet can be shown to obey Zipf’s law at many different levels: number of connections between physical computers, and links between computers both follow this distribution.

One of the most interesting points that this paper remarks on regarding Zipf’s law is that networks that have a distribution following Zipf’s law are likely to be more vulnerable to cascades in the network. When studying cascades we saw that for a new product to take over an entire network, it must pass a threshold that will allow it to infiltrate all relatively isolated subgraphs of the network. Dually, in order to remain in the network with out being wiped out by an alternative product, a product must pass a threshold that makes some relatively isolated subgraph of the network stable enough to not switch away from it. The paper argues that viruses in Zipf distributed networks are able to survive without surpassing any threshold of infectivity. This likely has to do with the fact that insular groups can’t really exist in a Zipf like network.

Comments

Leave a Reply

Blogging Calendar

November 2018
M T W T F S S
 1234
567891011
12131415161718
19202122232425
2627282930  

Archives