
Penetrating Deep Web Networks

One of the disturbing things about the Internet is that ordinary search engines can be blind to much of the content that’s really out there, known as the “Deep Web.” The traditional analogy is the ocean: a search engine’s hyperlink-crawling technique is likened to dragging a net across the surface. Much lies buried in the depths, in scripted or dynamically generated sites, password-protected domains, and unlinked content, that traditional web search simply cannot access. The problem is that much of what’s down there is the Internet presence of groups involved in unsavory pursuits across the globe.

This article, titled “The Topology of Dark Networks” and published in the communications journal of the Association for Computing Machinery, discusses measured characteristics of four networks covering terrorist groups, drug trafficking, and gangs. These characteristics include descriptors such as average degree, average clustering coefficient, and link density. It goes on to assess the robustness of the computer-based networks among them using simulated bridge-based and node-based attacks. This struck me as relevant to the material Professor Tardos discussed in lecture on Nov. 9th, when she talked about information cascades on networks.
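For anyone unfamiliar with those descriptors, here is a small sketch of how they can be computed for an undirected network. The toy edge list and the function names are hypothetical and are not taken from the article; the definitions are the usual textbook ones (link density as actual links over possible links, clustering coefficient as the fraction of a node's neighbor pairs that are themselves linked).

```python
from itertools import combinations


def build_graph(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    """Mean number of links per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def link_density(adj):
    """Actual links divided by the number of possible links (needs >= 2 nodes)."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return m / (n * (n - 1) / 2)


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are linked to each other."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Hypothetical toy network: a triangle a-b-c with one extra node d hanging off c.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
toy = build_graph(toy_edges)

print(average_degree(toy))
print(link_density(toy))
print(average_clustering(toy))
```

On a real data set one would likely use a graph library instead; the point here is only to make the three quantities concrete.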

If one modifies the methodology Prof. Tardos discussed for modelling product adoption or viral marketing and applies it to data like those gathered in this article on computer and non-computer networks, one can go beyond predicting the relative success of node or bridge attacks and build an informative model of how a virus or piece of spyware would spread. This could be particularly useful for designing an investigation of, or an attack on, such a network or one of its components, since these networks are not necessarily set up with pathway efficiency or connectedness in mind. Specifically, one can use the data on degree and link density to predict which components of the network have density q, and pick a model node v of the network. Here p describes the fraction of v’s neighbors containing an infected file. One would analyze the link structure of sample target nodes throughout the network to determine hypothetical values of p for various sets of “initial adopters,” and by testing q against 1 − p could determine whether any components are dense enough to halt the cascade. Furthermore, using information about specific nodes, one could predict where such components lie and so select the optimal targets and type of viral attack for maximum effect.
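A minimal sketch of that test, under one loudly stated assumption: a node becomes infected once the infected fraction p of its neighbors reaches a fixed cutoff, called `threshold` below. The cutoff values, the toy graph, and the function names are all hypothetical. `cluster_density` returns the density q of a candidate component, meaning the smallest fraction of any member's neighbors that lie inside the component. In the standard threshold model, a component of uninfected nodes halts the cascade when its density exceeds 1 minus the cutoff.

```python
def run_cascade(adj, seeds, threshold):
    """Spread infection until no node changes; return the final infected set.

    Assumption: a node becomes infected when the fraction p of its
    neighbors that are infected is at least `threshold`, and never recovers.
    """
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in adj.items():
            if v in infected or not nbrs:
                continue
            p = len(nbrs & infected) / len(nbrs)
            if p >= threshold:
                infected.add(v)
                changed = True
    return infected


def cluster_density(adj, members):
    """Density q: the minimum, over members, of the fraction of each
    member's neighbors that are also members."""
    members = set(members)
    return min(len(adj[v] & members) / len(adj[v]) for v in members if adj[v])


# Hypothetical toy network: two triangles joined by the bridge c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
seeds = {"a", "b"}          # hypothetical "initial adopters"
candidate = {"d", "e", "f"}  # component suspected of blocking the spread

for threshold in (0.5, 0.3):
    q = cluster_density(adj, candidate)
    blocked = q > 1 - threshold
    final = run_cascade(adj, seeds, threshold)
    print(threshold, round(q, 3), blocked, sorted(final))
```

With the higher cutoff, the second triangle is dense enough to stop the spread at the bridge; with the lower one it is not, and the whole toy network ends up infected. Repeating this over different seed sets is one way to compare candidate targets.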

