## Social Utility Through Linear Algebra

The well-curated Facebook social graph provides us with structured information about social connections between people like never before in history. However, while the graph data has already provided social utility to Facebook users around the world, especially with the release of the Graph Search tool, there is still a large amount of utility hidden deep in the structure of the graph, some of which we hope to derive through more detailed analysis. In this post, we will discuss one such secondary feature of the social graph and how we can compute such characteristics easily.

The Facebook social graph is fully connected, with 99.1% of users belonging to a single large connected component [1]. However, a given individual user’s view of the social graph is typically divided into clusters of other people – large subsets of users with a high degree of connectivity. For example, a Cornell student may have a large cluster of friends from Cornell, a cluster of friends from their high school, a cluster of acquaintances from their summer internship, and so on. It is always a nice surprise for someone to find the rare edge between two of these clusters, for example to make the discovery that someone from high school in New York is friends with someone met in the Bay Area over the summer. Computing this information using the Facebook social graph can provide additional social utility to its users.

These “surprising” mutual friends corresponding to edges spanning clusters correspond roughly to the sparse cuts in a user’s social graph, or partitions of the social graph that minimize the ratio of the number of edges spanning the partitions to the number of vertices in the smaller half. While this problem is NP-hard [2], we can consider efficient approximation algorithms that can be used to detect sparse cuts within a certain degree of tolerance. This corresponds to finding a quantity known as the “expansion” of a graph, and we can use Cheeger’s inequality to bound the expansion of the graph based on the eigenvalues of the normalized Laplacian matrix of the graph [3].

Once we have a bound on the expansion of the graph, we can fine-tune well-known approximation algorithms to the sparsest cut problem to detect sparse cuts with a decreased tolerance level. Additionally, researchers have discovered the result that whether the eigenvalues of the Laplacian are small or large indicates precisely how the vertices can be divided into subsets defining sparse cuts in the graph [4], thus solving the problem of computationally finding surprising mutual friends.