
Lead Generators and Your Most Sensitive Queries

The link below leads to an article recently published by The Atlantic, detailing how “lead generators” track people’s search engine queries and use those queries against them. Many people think of online search engines like Google as places for anonymous queries. This is far from true: if you google “need rent money fast” or “can’t pay rent,” that query will likely help advertising agencies build a profile of you based on all your queries. For low-income people this can be very dangerous, because among the results that Google displays are ads placed by companies that are trying to undermine Google’s policies against predatory financial advertising. These companies are called “lead generators,” and they can lead people to pages where they are prompted to enter sensitive financial information. Many of these payday loan systems are designed to take money from consumers quickly at high interest rates, ultimately worsening situations that are already quite precarious. Lead generators are under scrutiny from the federal government, and yet not much has been done. And while Google has policies in place to counter these companies, it does not provide adequate measures to keep them from circumventing the system. It is simply too easy for fraudsters and lead generators to buy the metadata that helps them build profiles of people who are in trouble.

How can we relate the proliferation of lead generators across the World Wide Web to concepts covered in Networks? Modeling the Web as a directed graph is useful for illustrating the process by which users arrive at lead generators’ web pages after entering a query that some natural language processing system could consider to be sensitive financial information. Perhaps by illustrating the system in this way, we can consider various ways in which large search engines like Google could better enforce their policies against predatory financial advertising. It will be important to consider the term reachability, defined in Chapter 13 of the text as “identifying which nodes are reachable from which others using [directed] paths.” Ultimately, we will see that the results page for the user’s first query, say, “need fast cash for rent,” belongs to the giant SCC, the giant strongly connected component at the center of our illustration. The side pop-ups and advertisements from lead generators, however, are OUT nodes: nodes that can be reached from the giant SCC but cannot reach it. They are downstream. The most volatile parts of this system are the pages that lead you to enter sensitive financial information. These are called tendrils: pages that can reach OUT but cannot be reached from the giant SCC. This is the fundamental problem: search engines like Google cannot monitor pages that they cannot reach. The “Bow-Tie Structure” allows us to see the ways in which lead generators take advantage of the structure of the Web to harm consumers in tough financial situations.
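To make the reachability idea concrete, here is a minimal Python sketch that sorts the pages of a tiny made-up web into the bow-tie parts described above. The page names (search_results, news_site, lender_ad, data_form, blog) and the links between them are hypothetical and chosen only for illustration. They are not taken from the article or from any real crawl.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from `start` along directed edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Return a copy of the graph with every edge flipped."""
    rev = {}
    for src, targets in graph.items():
        rev.setdefault(src, [])
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev


def bow_tie(graph, core_node):
    """Classify nodes relative to the strongly connected component of `core_node`."""
    rev = reverse(graph)
    forward = reachable(graph, core_node)   # nodes the core can reach
    backward = reachable(rev, core_node)    # nodes that can reach the core
    scc = forward & backward
    out = forward - scc
    in_ = backward - scc
    rest = set(rev) - scc - out - in_       # rev contains every node as a key
    # Tendrils: can reach OUT, or can be reached from IN, without touching the SCC.
    # (In this simplified sketch a "tube" from IN to OUT would also land here.)
    tendrils = {n for n in rest
                if reachable(graph, n) & out or reachable(rev, n) & in_}
    return {"SCC": scc, "IN": in_, "OUT": out, "tendrils": tendrils}


# Hypothetical toy web: an edge A -> B means page A links to page B.
toy_web = {
    "search_results": ["news_site", "lender_ad"],
    "news_site": ["search_results"],
    "blog": ["search_results"],
    "data_form": ["lender_ad"],
    "lender_ad": [],
}

parts = bow_tie(toy_web, "search_results")
for name, nodes in parts.items():
    print(name, sorted(nodes))
```

In this toy graph the data_form page links to an OUT page, but no path from the core leads to it. That is the situation the paragraph above describes: a crawler starting from the giant SCC and following links would never find it.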

Ultimately, there is no one solution to this problem, and though the bright minds at Google can surely solve it, their incentive to do so is not great. What Google and other search engines need to do is find a way to keep users’ query data secure and to block the “tendril” pages before the user is able to reach them.



November 2015