Skip to main content



Destroying Criminal Networks Using Structural Analysis, PageRank, and Machine Learning

How can we better attack criminal networks? Analysis of such networks are difficult since they are private by nature, and have many potential hidden links even if you knew about some of the members. Researchers from Stanford University conducted an analysis using available criminal network datasets which showed links between individuals/organizations and criminal events to create a strategy for destroying criminal networks. For the study, they used an anonymized large dataset with 1.5 million nodes worldwide and 4 million undirected edges. Compared to other studies, this dataset is much larger and will allow better analysis of criminal networks worldwide.

In order to emulate the existence of hidden links, the researchers created “corrupted networks”, which are existing networks that they remove anywhere from 5-50% of edges from. Then, they used a form of machine learning for link prediction called Gradient Boosting Machine (GBM). The model gave predictions that could be used as edge weightings between people in a network. Then, the pagerank algorithm could be applied to the weighted network and each node would get a score. The pagerank score was used as a “suspiciousness index”, which could be used to target the most useful people in the criminal networks. Those with higher suspiciousness indexes were more likely to be vital to the organization’s operations, and thus, eliminating them would make destroying the networks as a whole faster and more efficient than simply choosing random nodes (people) to remove.

Surprisingly, the researchers found that for each corrupted network they tested, the removal of more edges actually led to increased accuracy in finding hidden links even though the initial graph was more incomplete.

In conclusion, the researchers were able to obtain weighted pagerank scores from the machine learning model, and tested various attacking methods on the network. Both the methods of removing weighted pageranks and removing single index pagerank methods had their disadvantages and advantages, and the best performance in destroying a theoretical criminal network was achieved by combining them into a hybrid method.

This relates to the material we are learning in class, as we learned (roughly) how Google’s PageRank algorithm works. In this study, a similar approach was used to assign suspiciousness indexes instead of search relevance indexes. The study also showed an example of triadic closure, as hidden links were more likely to appear in pairs that are close together in terms of edge distances (when they are adjacent), and hidden links are less likely to appear when two nodes are further apart.

Source

Comments

Leave a Reply

Blogging Calendar

October 2017
M T W T F S S
 1
2345678
9101112131415
16171819202122
23242526272829
3031  

Archives