Skip to main content



Maximizing the spread of influence in a Social Network

https://www.cs.cornell.edu/home/kleinber/kdd03-inf.pdf

 

While searching for an intriguing article to write about, I found an interesting paper written by three cornell professors, Professor David Kempe, Jon Kleinberg, and Eva Tardos. The topic of this paper is “Maximizing the Spread of Influence through a Social Network”. In class and in problem sets, we learned about variety of social network graphs, in which we determined the most influential node – usually located at the center of the clusters connected to many other nodes. I found this paper very interesting because it describes a real life application of identifying an influential node.

The paper tackles the following question: which set of nodes in a social network graph should a company target in order to maximize the number of new adopters for a brand new product? The paper compares several different algorithms of selecting a few key nodes for initial stimulation, assuming that the activated key node has some probability of influencing neighboring nodes to be activated as well. This brings a cascade of activation, and the activation stops once there are no more nodes left to be activated.

Intuitively, I thought the most influential nodes are the ones present inside the cluster, which are connected with a lot of other nodes. Stimulating such nodes first would bring the maximized spread of influence in a social network. One of the node targeting method in the paper was based on the “centrality” of the node, which according to my intuition should be the most effective method. However, after simulations, centrality based selection did not perform as well compared to other algorithms described in the paper. This was due to the fact that once a single central node is targeted, it was unnecessary to target other central nodes as they were covered by the same network after a few propagation steps. After the first central node is selected, selecting any other central node was inefficient.

According to the simulations, the most efficient algorithm of targeting initial set of nodes in the paper was the greedy hill-climbing algorithm. This algorithm first selects the node that brings the most increase influence, then it selects the node that brings the second most increase in influence. Then it continues until there are no more increases in influence.

The paper was interesting for me because it took real network structure to analyze an issue that could be applied in the real world. It proved that greedy hill-climbing algorithm is a great approximation for these kinds of problem. I thought maybe applying the Strong Triadic Closure Property that we learned in class could be useful in determining the probability of activating a neighboring node. I assume that targeting a node connected closely to other two nodes would make it more probable for all three nodes to be activated.

Comments

Leave a Reply

Blogging Calendar

September 2018
M T W T F S S
 12
3456789
10111213141516
17181920212223
24252627282930

Archives