Connecting Graph Theory and Big Data
Using graph theory, many networks can be described simply as a set of nodes and the edges connecting them. Graph theory underpins much of network analysis, including the development of algorithms such as greedy algorithms and the PageRank algorithm. Although it is often an abstract branch of mathematics usually associated with combinatorics, graph theory can also be applied practically to represent a network concisely and visually. One basic analysis using graph theory is the prediction of new edges (relationships, bonds, connections, etc.). One simple heuristic works as follows. Choose the lowest-degree node in a component (the node with the fewest edges connected to it) and find the nodes that are two edges away from it (that is, separated from it by exactly one intermediate node). The new edge is predicted to form between that lowest-degree node and the highest-degree node among those at distance 2 from it.
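The edge-prediction heuristic described above can be sketched in a few lines of Python. This is only a minimal illustration, not a standard library routine: the names `build_graph` and `predict_edge` are hypothetical, the graph is assumed to be undirected and stored as a dictionary of neighbor sets, and ties in degree are broken alphabetically, which is an assumption the text does not specify.

```python
def build_graph(edges):
    """Build an undirected graph as {node: set_of_neighbors} (illustrative helper)."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def predict_edge(graph):
    """Return (lowest_degree_node, best_node_two_edges_away), or None if no candidate."""
    if not graph:
        return None
    # Lowest-degree node; sorting first makes tie-breaking deterministic (an assumption).
    start = min(sorted(graph), key=lambda n: len(graph[n]))
    # Collect neighbors of neighbors, then drop direct neighbors and the node itself,
    # leaving only the nodes at distance exactly 2.
    two_away = set()
    for neighbor in graph[start]:
        two_away |= graph[neighbor]
    two_away -= graph[start]
    two_away.discard(start)
    if not two_away:
        return None
    # Among those, pick the highest-degree node as the other end of the new edge.
    target = max(sorted(two_away), key=lambda n: len(graph[n]))
    return (start, target)


edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
print(predict_edge(build_graph(edges)))
```

In this small example, `a` and `e` both have degree 1, so the alphabetical tie-break selects `a`. Its distance-2 nodes are `c` and `d`, and `d` has the higher degree, so the sketch predicts the edge `a`-`d`.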
In the article “Is Graph Theory the Key to Understanding Big Data?”, Google’s PageRank algorithm is analyzed through a graph-theoretical lens. PageRank ranks pages (nodes) by how well connected they are: a page scores highly when many other pages link to it, and links from pages that are themselves highly ranked count for more. Results are then ordered from the highest-scoring page downward. Google takes advantage of the idea that a highly connected page is likely to be a more popular result, based on connectivity alone. If many other sites link to an article, it is probably a popular topic, since so many others have written about it. Ranking articles by connectivity therefore tends to track popularity well. The article goes on to address possible applications of graph theory to networks, such as linking web pages visited to products purchased. That potential application could provide an extremely useful piece of “big data.” Applying that data could help shorten shopping time and increase customer satisfaction. Both outcomes are highly appealing to retailers, so retailers may consider purchasing this data to gain that advantage.
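To make the ranking idea concrete, here is a simplified PageRank sketch using repeated score passing (power iteration). It is a textbook-style illustration under stated assumptions, not Google's production system: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy link data are all chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the list of pages it links to."""
    # Gather every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Pages with no outgoing links share their score evenly with everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        # Every other page splits its score evenly among the pages it links to.
        for p in pages:
            targets = links.get(p, [])
            for t in targets:
                new_rank[t] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Toy web: three pages link to "c", and "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, `c` ends up ranked first because three pages link to it, and `a` ranks above `b` and `d` because its single inbound link comes from the highly ranked `c`. That illustrates why a link from a well-connected page counts for more than a link from an obscure one.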
Understanding networks can be hugely advantageous, and understanding graph theory is a crucial part of that bigger picture. As previously mentioned, networks hold potentially lucrative “big data,” and algorithms are the best tools for sorting through them. All in all, this article brings to light a necessary prerequisite for understanding and analyzing networks: understanding graphs.
Source: http://www.wired.com/2014/03/graph-theory-key-understanding-big-data/
