The Limitations of Using Random Graphs to Model Real World Networks
Real world networks were a difficult topic to study in the past, due to challenges in acquiring and analyzing data. However, the growth of the internet and powerful tools to process data have resulted in a rapid growth in networks research, and the ability to study them much more quantitatively. Access to larger networks (social networks, power networks, the world wide web, etc.) have made clear that many previous graph-theory models which were used to approximate networks were incorrect – the most famous of which, the Erdos-Renyi model, failed to replicate the clustering, triadic closure, and hubs seen in real-world networks. As a result, there were two notable models created in an attempt to fix some of the problems Erdos-Renyi had: The Watts-Strogatz model, which generated random-graphs with small-world properties, and the Barabasi-Albert model, which generated scale-free networks through preferential attachment. As it turns out, most large networks in the present day have scale-free structure, or a network structure in which most nodes have few connections, some nodes have a medium number, and a few nodes are very well connected. Using Twitter as an example, most of us may have a few hundred followers, some of us – Andrew Aquino, for example – have a few thousand, and Justin Bieber has millions.
Because of the nature of citation networks and the web, which are constantly evolving, directed graphs, they can be reasonably approximated with the Barabasi-Albert model. For example, we can define all research papers to be nodes, and all of the other papers they cite to be directed edges from the newer paper to the older paper. We can then make several assumptions: first, papers with many citations (many incoming edges, a high degree) are more likely to be cited by newly published papers, as the presence of many citations implies quality. Second, papers published recently are more likely to be cited, since it is less likely a text from 50 years ago is relevant to researchers now. Applying these assumptions to a mathematical model yields a very versatile algorithm for generating citation networks, which can be appropriately modified to model all different types of scale free networks – the creation of more accurate models is an interesting and practical field of research actively pursued now.
http://journals.aps.org/pre/abstract/10.1103/PhysRevE.63.056125