When we look at adtech, much of the focus falls on one of two places: the advertisers who sell specific products via online advertising, or the data brokers who package and sell our information. And this is good - data brokers in particular pose a unique threat, and need much more attention. However, online ads get delivered via a network of middlemen that automate and streamline the process. These middlemen are effective - for example, when we read about youth in Macedonia making thousands of dollars a month from political misinformation aimed at US audiences, we need to remember that the profits generated by these sites wouldn't happen without the use of adtech. (These middlemen are generally described by those in the advertising industry in jargon-heavy prose. For people looking for a high-level background, this post discusses Supply Side Platforms, this post discusses Demand Side Platforms, and this video describes how ads are bought, targeted, and delivered.)
In a speech from January 2017, Randall Rothenberg - the President and CEO of the Interactive Advertising Bureau, or IAB - directly acknowledged the effectiveness of these middlemen, and the role that advertising plays in making misinformation profitable:
As an industry, it is our obligation to again step up. But this time, our goal cannot be merely to fix our supply chain. Our objective isn’t to preserve marketing and advertising. When all information becomes suspect – when it’s not just an ad impression that may be fraudulent, but the data, news, and science that undergird society itself – then we must take civic responsibility for our effect on the world.
Who Shares What
A few weeks back, Kris Shaffer and I began talking about getting a clearer understanding about what people read and share on Twitter, and what that looks like across the political spectrum. We started looking at the stories shared - and the sites they are shared from - to get a sense of patterns. Based on patterns observed in the data Kris collected, I created a list of 25 sites from across the political spectrum, ranging from misinformation targeted to progressives, to left leaning, to mainstream media, to right leaning, to hate sites.
- Addicting Info
- AllenBWest
- Alternet
- Bipartisan Report
- Breitbart
- Daily Caller
- Daily Stormer
- Fox News Insider
- Gateway Pundit
- Guardian
- Huffington Post
- New York Times
- Newsmax
- Patriot Post
- Project Veritas
- Ralph Retort
- Reddit
- RT
- The Atlantic
- The Blaze
- Wall Street Journal
- Washington Post
- White Rabbit Radio
- YouTube
- ZeroHedge
This list of sites is obviously not exhaustive, and the work here is just a start, but from this initial review, some interesting patterns begin to emerge. Over the entire list of 25 sites, just under 500 different ad tracking domains are called. Of all of these adtech companies, over 60 are used on 10 or more sites. Amazon Adsystem is used on 12 of the 25 sites; 18 of the 25 sites use adtech supplied by Yahoo, and 23 of the 25 sites use Doubleclick, from Google.
To collect information about the URLs called when visiting each site, we set up an intercepting proxy - for these tests, I used OWASP ZAP, an open source tool. I browsed using Firefox, and set up a custom profile to use while testing. Before visiting each site, I removed all browsing history, cache, and cookies. To test each site, I visited the home page, a story linked from the home page, and a second story or page on the site, for a total of three pages per site. Then, using reporting functionality built into ZAP, I exported all the URLs called while visiting each site.
This is the same list of sites, sorted by the number of different domains called when visiting each site. At the risk of getting overly technical, a "single domain" means the base domain, so if a site made calls to "api.doubleclick.net" and "ads.doubleclick.net", that counts as a single domain because of the common base of "doubleclick.net":
- ZeroHedge: 184
- AllenBWest: 183
- Daily Caller: 152
- Gateway Pundit: 142
- The Blaze: 139
- Bipartisan Report: 129
- Huffington Post: 129
- RT: 121
- Alternet: 116
- Breitbart: 87
- Ralph Retort: 82
- New York Times: 78
- Newsmax: 78
- Addicting Info: 75
- Washington Post: 71
- The Atlantic: 68
- Fox News Insider: 55
- Wall Street Journal: 49
- Project Veritas: 20
- Guardian: 18
- Reddit: 16
- White Rabbit Radio: 16
- Daily Stormer: 12
- YouTube: 11
- Patriot Post: 10
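The counting rule described above (subdomains collapse into a single base domain) can be sketched in a few lines of Python. This is a minimal illustration, not the script used for this post: the function names and the sample URLs are hypothetical, and the "last two labels" rule is a simplification that would miscount suffixes such as "co.uk" (a more careful version would consult the Public Suffix List).

```python
from urllib.parse import urlsplit


def base_domain(url: str) -> str:
    """Collapse a URL to its last two host labels (naive base-domain rule)."""
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    return ".".join(labels[-2:])


def distinct_domains(urls):
    """Return the set of base domains seen in an iterable of URLs."""
    domains = {base_domain(u) for u in urls}
    domains.discard("")  # drop entries that had no parseable host
    return domains


# Hypothetical sample of URLs exported from the proxy for one site.
sample_urls = [
    "https://ads.doubleclick.net/pixel?id=1",
    "https://stats.g.doubleclick.net/collect",
    "https://www.google-analytics.com/analytics.js",
]

print(sorted(distinct_domains(sample_urls)))
print(len(distinct_domains(sample_urls)), "distinct domains")
```

Run against the full URL export for a site, the length of the returned set would correspond to the per-site number in the list above.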
The core dataset used for the analysis in this post is available here. At the end of this post, I also include additional details on the trackers used and their affiliations.
When we drill down into individual sites, we see that Daily Stormer and White Rabbit Radio - two far right sites - make use of Doubleclick advertising - owned by Google - to generate ad revenue.
It's also worth noting that the owners of Daily Stormer and White Rabbit Radio use Google Analytics to understand how people interact with their sites. Daily Stormer and White Rabbit Radio have plenty of company here: Addicting Info, AllenBWest.com, Alternet, Breitbart, Daily Caller, Fox News Insider, the New York Times, Reddit, and the Atlantic - among others - also use this common infrastructure that is provided by - and sends data to - Google. When we look at the web through the lens of adtech, one thing becomes abundantly clear: adtech vendors sell indiscriminately, across the political and social spectrum. It's also clear that without the complicity of adtech vendors, sites on the political fringe - both right and left - would have far fewer resources.
The broad use of adtech to generate revenue creates a level of interconnectedness and dependency between content providers and the adtech networks that profit from them. Ads allow content providers to make money, but that money means different things - for both ad networks and content providers. The overhead of a four person content farm publishing dishonest clickbait, or of a right wing blog like Gateway Pundit, or of a left wing blog like Alternet, or a right wing site like Breitbart, is very different than that of the Wall Street Journal, or the New York Times, or the Washington Post. Yet, all of these sites use (as just one example) comScore, a data aggregator and ad exchange.
When Breitbart attempts to undermine the credibility of the Washington Post and the New York Times, the services offered by comScore make that profitable.
When Gateway Pundit attempts to undermine the credibility of the Washington Post and the New York Times, the services offered by comScore make that profitable.
When the Times and the Post cover news, they also attempt to generate revenue via the services of comScore. But, as these publishers fire back at one another, comScore - like its brothers in adtech - generates consistent revenue as a result of the crossfire where comScore arms both sides.
The pervasive use of the same adtech provided by the same companies to competing sites and political enemies raises some interesting questions.
- What does it mean that the same individual companies that specialize in data collection, analysis, and reuse are woven into our news and information systems across the ideological spectrum?
- What does it mean when the act of reading news, or engaging in political activism online, is an observed activity?
- What does it mean when legitimate news outlets are reliant on a small number of adtech companies for revenue, and these adtech companies sell to anyone, regardless of whether they traffic in hate, deception, or news?
- Given the higher expenses of doing news well - with editors, paid writers, professional fact checkers - what obligations, if any, does adtech have to police hate speech or propaganda?
Tracking the Trackers
Tracking the companies that profit from selling ads to all sides is complicated by the fact that adtech is highly opaque. If we attempt to visit the URLs that show up in the proxy logs, we are met with a dizzying array of responses, the vast majority of which are completely uninformative. Getting a company name generally requires some or all of these four steps:
Even with these tools, tying a specific domain to a specific company can be time consuming. As an example, the domain adsrvr.org is called in 17 out of the 25 sites we surveyed. Visiting its home page shows nothing.
A Whois lookup indicates that the domain has been registered anonymously via GoDaddy, so there is no company information publicly available for the domain.
However, we can track the IP address of the domain and do a reverse IP address lookup, which indicates three other sites hosted on the same IP address.
A search using the domain names turns up this opt-out page, which confirms that The Trade Desk controls adsrvr.org.
This is ridiculously opaque. You should not need to know how to do a reverse IP address lookup or a Whois search to know what company is collecting information on you - and this is just for one tracker loaded on one site. The Huffington Post loads upwards of 120 trackers; the Daily Caller loads upwards of 150; AllenBWest.com loads over 175. If we estimate very low, and assume that we can identify each tracker in 5 minutes, that still comes out to 600 minutes to identify all the trackers used by the Huffington Post, or 875 minutes for AllenBWest.com. The adtech companies that profit from our data and sell access to us as we surf the web use this opacity to further their business interests.
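The time estimate above is simple multiplication, and it can be spelled out as a short Python sketch. The tracker counts are the rounded-down figures from the paragraph, and the 5 minutes per tracker is the deliberately low assumption stated there; the function and variable names are just illustrative.

```python
MINUTES_PER_TRACKER = 5  # the deliberately low estimate from the text

# Rounded-down tracker counts quoted in the paragraph above.
tracker_counts = {
    "Huffington Post": 120,
    "Daily Caller": 150,
    "AllenBWest.com": 175,
}


def identification_minutes(trackers: int, minutes_each: int = MINUTES_PER_TRACKER) -> int:
    """Total minutes needed to identify every tracker on one site."""
    return trackers * minutes_each


for site, count in tracker_counts.items():
    minutes = identification_minutes(count)
    print(f"{site}: {minutes} minutes (about {minutes / 60:.1f} hours)")
```

Even under this generous assumption, a single site works out to somewhere between 10 and 15 hours of lookups.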
Commonly Used Trackers
As noted earlier, just over 60 trackers/services are used in 10 or more of the 25 sites we surveyed. For this post, I identified the companies involved to get a sense of who some of the bigger players are. To emphasize, the list of 25 sites surveyed is incomplete yet representative. This is the beginning of work that will likely be ongoing, as time allows.
But when we look at the 60 services used most commonly across these 25 different sites, the vast majority of these companies are IAB members. As Randall Rothenberg, the President and CEO of the IAB, observed in his speech:
(W)e face a challenge that has boiled over into crisis, perhaps the greatest crisis it is possible to face. For it is a crisis not of our industry, not of our digital media and marketing village, but a crisis of society writ large.
Right now, the status quo in adtech is to sell to all sides, and profit from both the arms race and the battles. While our discourse and news ecosystem remains mired in misinformation, adtech pulls profit.
Adtech profits when we read lies, and adtech allows liars to earn revenue.
Adtech profits when we read hate speech, and adtech allows the people who spread hate to earn revenue.
Adtech profits when places like the Huffington Post convince writers to publish for "exposure," and adtech allows the Huffington Post to generate revenue for these exploitive practices.
Adtech profits when people read traditional news outlets, and adtech allows these news outlets to generate revenue.
It's worth remembering that the impact of ad revenue will vary based on the overhead within an organization. The more a site cuts corners, eliminates editors and fact checkers, or doesn't pay writers, the greater the benefit of revenue generated via adtech. The benefits of adtech tilt the scales toward falsehood, sensationalism, and hate. Adtech in its current form - predicated on online monitoring of consumers, and selling access to user data via ad exchanges - gives a decided advantage to those who are willing to bypass facts in favor of bias, superficiality, or an emotional appeal.
Appendix 1 - Dataset
Full dataset in csv format: https://gist.github.com/billfitzgerald/5965a6009a9b939f4155cffea2fe8170
Appendix 2 - List of third party services, by URL
- doubleclick.net - used in 23 sites.
- google.com - used in 22 sites.
- googleapis.com - used in 22 sites.
- gstatic.com - used in 21 sites.
- google-analytics.com - used in 21 sites.
- googlesyndication.com - used in 21 sites.
- scorecardresearch.com - used in 20 sites.
- facebook.com - used in 19 sites.
- googletagservices.com - used in 19 sites.
- adnxs.com - used in 18 sites.
- demdex.net - used in 18 sites.
- yahoo.com - used in 18 sites.
- twitter.com - used in 17 sites.
- facebook.net - used in 17 sites.
- adsrvr.org - used in 17 sites.
- bluekai.com - used in 17 sites.
- tubemogul.com - used in 17 sites.
- advertising.com - used in 16 sites.
- bidswitch.net - used in 16 sites.
- openx.net - used in 16 sites.
- tidaltv.com - used in 16 sites.
- turn.com - used in 16 sites.
- agkn.com - used in 15 sites.
- casalemedia.com - used in 15 sites.
- rubiconproject.com - used in 15 sites.
- sitescout.com - used in 15 sites.
- tapad.com - used in 15 sites.
- 1rx.io - used in 15 sites.
- 2mdn.net - used in 14 sites.
- moatads.com - used in 14 sites.
- nexac.com - used in 14 sites.
- simpli.fi - used in 14 sites.
- contextweb.com - used in 14 sites.
- crwdcntrl.net - used in 14 sites.
- gwallet.com - used in 14 sites.
- quantserve.com - used in 14 sites.
- rfihub.com - used in 14 sites.
- spotxchange.com - used in 14 sites.
- cloudfront.net - used in 13 sites.
- adap.tv - used in 13 sites.
- addthis.com - used in 13 sites.
- revsci.net - used in 13 sites.
- adtechus.com - used in 12 sites.
- amazon-adsystem.com - used in 12 sites.
- adsymptotic.com - used in 12 sites.
- dotomi.com - used in 12 sites.
- media6degrees.com - used in 12 sites.
- mxptint.net - used in 12 sites.
- chango.com - used in 11 sites.
- eyereturn.com - used in 11 sites.
- bidr.io - used in 11 sites.
- eqads.com - used in 11 sites.
- everesttech.net - used in 11 sites.
- pubmatic.com - used in 11 sites.
- youtube.com - used in 10 sites.
- fbcdn.net - used in 10 sites.
- adhigh.net - used in 10 sites.
- cloudflare.com - used in 10 sites.
- ib-ibi.com - used in 10 sites.
- mathtag.com - used in 10 sites.
- basebanner.com - used in 10 sites.
- eyeviewads.com - used in 10 sites.
- tribalfusion.com - used in 10 sites.
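Each "used in N sites" figure in the list above is a count of distinct sites on which a base domain was called at least once. A rough Python sketch of that tally follows. It assumes the proxy export has already been loaded into a dictionary mapping each site to its list of called URLs; that structure, the function names, and the sample data are all hypothetical, and the layout of the actual CSV is not described here. Note that this simple version also counts a site's own first-party domain.

```python
from collections import Counter
from urllib.parse import urlsplit


def base_domain(url: str) -> str:
    """Naive base domain: the last two labels of the host name."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.lower().split(".")[-2:])


def sites_per_domain(urls_by_site):
    """Count, for each base domain, the number of sites that called it."""
    counts = Counter()
    for urls in urls_by_site.values():
        domains = {base_domain(u) for u in urls}
        domains.discard("")
        counts.update(domains)  # each site contributes at most 1 per domain
    return counts


def widely_used(counts, minimum=10):
    """Domains seen on at least `minimum` sites, most widespread first."""
    hits = [d for d, n in counts.items() if n >= minimum]
    return sorted(hits, key=lambda d: (-counts[d], d))


# Hypothetical miniature dataset: three sites instead of 25.
urls_by_site = {
    "site-a.example": ["https://ads.doubleclick.net/a", "https://x.adnxs.com/b"],
    "site-b.example": ["https://stats.g.doubleclick.net/c", "https://ads.doubleclick.net/d"],
    "site-c.example": ["https://ib.adnxs.com/e", "https://pixel.doubleclick.net/f"],
}

counts = sites_per_domain(urls_by_site)
for domain in widely_used(counts, minimum=2):
    print(f"{domain} - used in {counts[domain]} sites.")
```

With the real export and the default `minimum=10`, the output would have the same shape as the list in this appendix.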
Appendix 3 - List of most used third party services, with additional details
RhythmOne is an IAB member.
AOL/adap.tv is an IAB member.
AddThis is an IAB member.
AppNexus is an IAB member.
The Trade Desk is an IAB member.
Drawbridge is an IAB member.
AOL is an IAB member.
Advertising.com is an IAB member.
Neustar is an IAB member.
Amazon is an IAB member.
ConvertMedia is an IAB member.
IAB and TRUSTe member
Index Exchange is an IAB member.
Rubicon Project is an IAB member.
- Cloudfront is owned by Amazon.
- It is used primarily as a CDN, so its use will vary widely among sites.
- It is not explicitly used for ad networks.
- Like Cloudfront, Cloudflare is a CDN.
Pulsepoint is an IAB member.
Lotame is an IAB member.
Adobe is an IAB member, although their Marketing Cloud appears to not be an IAB member.
Conversant is an IAB member.
Adobe is an IAB member.
Eyereturn is an IAB member.
Eyeview is an IAB member.
Facebook is an IAB member.
The following domains are associated with Google services - some are ad-related, some (like YouTube) both provide a service and tracking. All of these domains - individually - were called on 10 or more of the 25 sites surveyed.
- 2mdn.net (part of Google/Doubleclick)
Additional info on these domains/services:
Google is an IAB member.
RadiumOne is an IAB member.
MediaMath is an IAB member.
Dstillery is an IAB member.
Maxpoint is an IAB member.
Datalogix is an IAB member.
OpenX is an IAB member.
Pubmatic is an IAB member.
Quantcast is an IAB member.
AudienceScience is an IAB member.
Rocket Fuel is an IAB member.
Rubicon Project is an IAB member.
comScore is an IAB member.
Simplifi is an IAB member.
Sitescout is an IAB member.
SpotXchange is an IAB member.
Tapad Inc is an IAB member.
Videology is an IAB member.
Exponential is the company behind Tribal Fusion (https://apps.ghostery.com/en/apps/tribal_fusion). Exponential is an IAB member.
Tubemogul is an IAB member.
Turn is an IAB member.
Yahoo is owned by Verizon, and is an IAB member.