Researching Political Ads -- A Process, and an Example

13 min read

It might seem like the 2020 elections are a long way away (and in any sane democracy, they would be), but here in the US, we have a solid fourteen months of campaigning ahead of us. This means that we can look forward to fourteen months of attack ads, spurious claims, questionable information -- all of it amplified and spread via Facebook, YouTube, Instagram, Reddit, Snapchat, Telegram, Pinterest, and Twitter, to name a few.

In this post, I break down some steps that anyone can use to uncover how political ads or videos get created by looking at the organizations behind the ad.

The short version of this process:

  • Find organizations
  • Find people
  • Find addresses

Then, look for repetition and overlaps. Later in this post, I'll go into more detail about what these overlaps can potentially indicate.

Like all content I share on this blog, this post is released under a Creative Commons Non-Commercial Attribution Share-Alike license. Please use and adapt the process described here; if you use it in any derivative work please link back to this page.

1. Steps

1. Find out who is running the ad. This can be found in multiple ways, including identifying information at the end of the ad or finding social media posts or a YouTube channel sharing the ad. If an ad or a piece of content cannot immediately be sourced, that's a sign that the content might not be reliable. It's worth highlighting, though, that the clear and obvious presence of a source doesn't mean the content is reliable -- it just means that we know who created it.

2. Visit the web site of the organization or PAC running the ad, if they have one. Look for names of people involved in the project, and/or other partner orgs. While the lack of clear and obvious disclosure of the organization/people behind a site is a reason for concern, disclosure does not mean that the source is reliable or to be trusted. The organizational affiliation, or people behind the organization, should be understood as sources for additional research.

If, after steps 1 and 2, there is no clear sign of what people or organizations are behind the ad, that can indicate that the ad is pushing unreliable or false information.

3. Do a general search on organization or PAC name. Note any distinct names and/or addresses that come up.

4. Do a search at the FEC web site for the PAC name. Note addresses, and names of key officials in the PAC.

5. Do a focused search on the exact addresses on the FEC web site. Be sure to include suite numbers.

6. Do a focused search on any key names on the FEC web site.

The point of these searches is to find repetition: shared staff across orgs, and a common address across orgs, can suggest coordination.

While follow up research on organizations sharing a common address, or staff shared across multiple orgs, would be needed to help clarify the significance of any overlaps, this triage can help uncover signs of potential coordination between orgs that don't disclose their relationships.
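The hunt for repetition can be sketched in a few lines of Python. This is a minimal sketch, assuming you have been noting organizations, people, and addresses by hand as you go; the record layout, the `find_overlaps` helper, and every name and address below are hypothetical placeholders, not real research data.

```python
from collections import defaultdict

# Hypothetical notes gathered during steps 1-6: one record per organization.
# All names and addresses are made-up placeholders.
notes = [
    {"org": "Example PAC A", "people": ["Person One"],
     "addresses": ["123 Main Street Suite 111"]},
    {"org": "Example PAC B", "people": ["Person One", "Person Two"],
     "addresses": ["123 Main Street Suite 111"]},
    {"org": "Example PAC C", "people": ["Person Three"],
     "addresses": ["PO Box 1"]},
]

def find_overlaps(records, field):
    """Map each value of `field` to the orgs listing it, keeping values shared by 2+ orgs."""
    seen = defaultdict(set)
    for record in records:
        for value in record[field]:
            # Lowercase and trim so trivial formatting differences still match.
            seen[value.strip().lower()].add(record["org"])
    return {value: sorted(orgs) for value, orgs in seen.items() if len(orgs) > 1}

shared_people = find_overlaps(notes, "people")
shared_addresses = find_overlaps(notes, "addresses")
print(shared_people)
print(shared_addresses)
```

Any value that shows up under more than one organization is a lead for follow-up research, not proof of coordination.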

Searching Notes

a. When doing a general search on the web for a PAC name, start with DuckDuckGo and Startpage. Your initial search should put the organization name in quotes. If DuckDuckGo and Startpage don't get you results, then switch to Google. However, because most organizations do white hat or black hat SEO with Google in mind, using other search engines for research can often get better results.

b. When searching the FEC web site, you can often get good results without using the search interface of the FEC web site. To do this, use this structure for your search query:

  • "your precise search string" site:docquery.fec.gov or
  • "your precise search string" site:fec.gov.
  • when searching for an address, split the address and the suite number: "123 Main Street" AND 111 site:fec.gov. Using this syntax will return results at "123 Main Street, Suite 111" or "123 Main Street STE 111" or "123 Main Street # 111". 

I generally use docquery.fec.gov first, as that brings up results that are directly from the filings, but either will work.
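If you run these searches often, the query strings can be generated rather than typed. This is a minimal sketch of the query structure described above; the function names `fec_queries` and `address_query` are my own illustration, and the output is plain text to paste into a search engine.

```python
def fec_queries(phrase):
    """Return the two site-restricted queries for an exact phrase."""
    quoted = f'"{phrase}"'
    return [f"{quoted} site:docquery.fec.gov", f"{quoted} site:fec.gov"]

def address_query(street, suite, site="fec.gov"):
    """Quote the street but keep the suite number separate, so that
    'Suite 111', 'STE 111', and '# 111' can all match."""
    return f'"{street}" AND {suite} site:{site}'

for query in fec_queries("New Faces GOP"):
    print(query)
print(address_query("123 Main Street", "111"))
```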

Unlike with searches across the open web, where other search engines often do better, Google will often return cleaner results for FEC filings than the search built into the FEC site.

A note on names of companies and individuals

In this writeup, we will be discussing companies, political groups, politicians, and consultants. Generally, companies, political groups, and politicians will be named in the text of this writeup.

I have reviewed screenshots and obscured names and email addresses that contained names, and in general individuals will not be named in this writeup. However, in some cases, the names of individuals will be accessible and visible via URLs shared in this document. This is a decision that I struggled with, and am still not 100% okay with, but it's hard to both show the process while not showing any potentially identifiable information.

I am not comfortable naming people, even when their names are readily available in the public record via multiple sources (and to be clear, all of the information described here is from publicly available documents). The fact that a person's name can be found via public records doesn't justify pulling a name from a public record and blasting it out. In this specific case, in this specific writeup, I made an intentional effort to not include the names in screenshots or in text. This provides some incremental protection (the names won't be visible in this piece via search, for example), while still providing some clear and comprehensible instructions so that anyone can do similar research on their own.

But, for people doing this research on your own: do not be irresponsible with people's names and identities. Naming people can put a target on them, and that is just not right.

And, if anyone who reads my piece uses my work to target a person, you are engaging in reprehensible behavior. Stop. Conducting real research means that you will see real information. If you lack the moral and ethical character to use what you learn responsibly, you have no business being here.

2. Using the steps to analyze an ad

To show how to use this process, I will use the attack ad levelled at Representative Ocasio-Cortez during a recent debate among Democratic presidential candidates. Representative Ocasio-Cortez responded to the ad on Twitter:

AOC on attack ad

To start, OpenSecrets has a breakdown of the major funders of the PAC behind the ad. The writeup here doesn't look at the funders; it goes into more detail about the PAC, and ways of researching them. If you are looking for information about specific funders, the post at OpenSecrets is for you.

Step 1: Who is running the ad

In this instance, the group behind the ad is pretty simple to find. A person connected to the group quote tweeted Representative Ocasio-Cortez:

Response to AOC

This leads us to the web site for New Faces GOP PAC at newfacespac.com.

Step 2: visit the web site

The site features information about Elizabeth Heng, who lost the 2018 Congressional race for California District 16. It also features the ad that attacks Representative Ocasio-Cortez.

The site also includes a donation form, and at the bottom of the form we can see a small piece of text: "Secured by Anedot." This text gives us a bookmark.

Anedot embedded form

Many forms on political sites are embedded from third party services, and if we look at the original form we can often get useful information. To find the location of this form, we hit "ctrl-U" to view the page source, and then search the text (using ctrl-F) for "anedot".

This identifies the URL of the form.

Anedot URL
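The same "view source, then search" step can be done with Python's standard library when a page is long. This is a minimal sketch, assuming you have saved the page source as text; the `KeywordLinkFinder` class is my own illustration, and the sample HTML and its URL are hypothetical placeholders, not the real form location.

```python
from html.parser import HTMLParser

class KeywordLinkFinder(HTMLParser):
    """Collect every URL-like attribute value that contains a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and value.startswith("http") and self.keyword in value.lower():
                self.matches.append((tag, name, value))

# Hypothetical page source; the URL is a placeholder.
sample_html = """
<html><body>
  <p>Donate today</p>
  <iframe src="https://example.com/anedot/donate?embed=true"></iframe>
  <a href="https://example.com/about">About</a>
</body></html>
"""

finder = KeywordLinkFinder("anedot")
finder.feed(sample_html)
print(finder.matches)
```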

Strip away "?embed=true" from the end of the link, and you can go directly to the original form. In this case, the form gets us an address:

Anedot address

We'll note that for use later.
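Stripping "?embed=true" by hand works fine, but if the link carries other parameters it is safer to remove only the one you mean to. This is a minimal sketch using Python's standard `urllib.parse`; the `strip_embed` name and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def strip_embed(url):
    """Drop the `embed` query parameter and keep everything else."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "embed"]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Placeholder URLs for illustration only.
print(strip_embed("https://example.com/anedot/donate?embed=true"))
print(strip_embed("https://example.com/anedot/donate?ref=home&embed=true"))
```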

Step 3: Search for the PAC name

A search for "New Faces GOP" turns up a listing in Bizapedia.

New Faces GOP search

This listing provides three additional names, and two additional addresses: a physical address in Texas, and a PO Box in California.

Bizapedia main page

The Texas address (700 Lavaca St Ste 1401, Austin, TX 78701) is a commonly used listing address for multiple organizations, which is a sign of a registry service. 

Lavaca Bizapedia

The California address (PO Box 5434 Fresno, CA 93755) appears to be used less widely.

Step 4: Do a search at the FEC web site for the PAC name

A search of the FEC web site returns the main page at https://www.fec.gov/data/committee/C00700252/?tab=summary - this page provides a review of fundraising and spending.

Additional details are available from the "Filings" tab at https://www.fec.gov/data/committee/C00700252/?tab=filings

FEC docs list
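Once you have a committee ID from the FEC site, the summary and filings pages follow the same URL pattern, so you can jump straight to them for any committee you find. This is a minimal sketch built only from the two URLs shown above; the `committee_url` helper is my own, and I have only confirmed the "summary" and "filings" tabs.

```python
from urllib.parse import urlencode

def committee_url(committee_id, tab="summary"):
    """Build the fec.gov committee page URL for a given tab."""
    base = f"https://www.fec.gov/data/committee/{committee_id}/"
    return base + "?" + urlencode({"tab": tab})

print(committee_url("C00700252"))
print(committee_url("C00700252", tab="filings"))
```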

The Statements of Organization provide an overview of key people in the organization, and of relevant addresses.

The most recent Statement of Organization (shown below) contains the same Fresno PO Box (PO Box 5434 Fresno, CA 93755) found in the Bizapedia listing. The filings also include the name of a treasurer. We will note this name for focused searches later.

FEC filing for New Faces GOP

At the end of Step 4, we have the following information:

  • multiple addresses to investigate;
  • multiple people connected to the PAC;
  • by virtue of having information pulled directly from FEC filings, some confirmation that our information is accurate.

Step 5: Do a focused search on the exact addresses on the FEC web site

For this search, we have three main addresses: the Fresno PO Box; the Austin, TX address; and the Washington DC address.

The Fresno PO Box links primarily to filings for New Faces GOP PAC, and for Elizabeth Heng's failed congressional bid.

FEC search PO Box

The search for the Texas address returns no direct results.

The search for the Washington DC address returns results for multiple different PACs, all connected to the Washington DC address.

FEC search on DC address

The FEC results also include the name of a consulting firm, "9Seven Consulting."

In the spending cycle for 2018, this firm received $156,000 in disclosed payments, per OpenSecrets.

Oddly, a web search for "9Seven Consulting" returns a top hit of a Digital Consulting firm named "Campaign Solutions" that also appears to be the employer of the person listed across multiple PACs connected to the Washington DC address at 499 South Capitol Street SW, Suite 405. These results are consistent across DuckDuckGo and Google.

9Seven search results

A search on that address returns yet another advocacy group.

Prime Advocacy search results

This group claims to specialize in setting up meetings with lawmakers.

By the end of Step 5, we have collected and/or confirmed the following information:

  • we have confirmed that many PACs list the Washington, DC address as their place of business;
  • we have confirmed that at least two political consulting firms list the same Washington, DC address as their place of business;
  • we have confirmed that multiple PACs list a key employee that is also part of a digital consulting firm.

Step 6. Do a focused search on any key names on the FEC web site

For this search, we will focus on the name that appears across multiple filings. A Google search returns 135 results. Based on a quick scan of names, these PACs appear to be almost exclusively right leaning. Obviously, the results contain some repetition, but there are upwards of 25 unique PACs here. In the screenshot below, the same name appeared on all results; it is obscured for privacy reasons.

Search results with name obscured
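Counting unique PACs in a list of repetitive search results is easier if you normalize the names first. This is a minimal sketch, assuming you have copied committee names out of the results by hand; the `normalize` helper and the names below are hypothetical placeholders, not the actual results.

```python
import re

def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace so near-duplicates match."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())

# Hypothetical committee names copied from search results, with repeats.
results = [
    "Example Freedom PAC",
    "EXAMPLE FREEDOM PAC",
    "Example Freedom PAC.",
    "Sample Victory Fund",
    "sample  victory fund",
]

unique_names = sorted({normalize(name) for name in results})
print(len(results), "results,", len(unique_names), "unique names")
print(unique_names)
```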

Additionally, the same name appears on an IRS filing connected to George Papadopoulos. This filing also uses the same DC address.

Shared name

Based on the results of this search, it appears pretty clear that these PACs were supported by a common process or a common entity. The combination of shared staff on their filings and, in some cases, a shared address, could imply a degree of coordination. Clearly, the DC address is used as at least a mailing address for multiple organizations that have at least some general overlap in political goals.

What Does All This Mean

The information uncovered via this process helps us understand what this ad is, what this ad isn't, and how political content gets generated.

Clearly, the group behind the ad is connected to Republican and right wing political organizing. It is unclear whether or not the shared infrastructure and shared process used to create these PACs indicates any level of cooperation across PACs, or whether the PAC-generating infrastructure is akin to HR outsourcing companies that manage payroll and benefits for smaller companies - but given the overlaps described in this post, a degree of coordination would certainly be possible and straightforward to set up, if it doesn't already exist.

The infrastructure supporting the New Faces GOP PAC seems solid. Based on their FEC filings, the group was formed in March of 2019, and by the end of June had raised over $170,000.00. While this isn't a huge amount of money by the standards of national political campaigns, it's still significant, and this level of access to donors, paired with access to the organizational expertise to manage the PAC, suggests a level of support that would be abnormal for a true grassroots effort.

However, this research just scratches the surface; on the basis of what we've seen here, there are multiple other PACs, people, and addresses that could expand the loose network we are beginning to see here. Political funding and PACs are a rabbit hole, and this research has us at the very beginning, leaning over and peering into the abyss.

But, understanding the ad in this context helps us see that it is one facet of what is likely a larger strategy that uses leaders like Representative Alexandria Ocasio-Cortez as foils to energize the Republican base. The hyperbolic rhetoric used in the ad normalizes overblown claims and irrational appeals in an effort to drown out conversations about policy. PACs can be used to fund a body of content that can help fuel future conversational spikes as needed, and to introduce narratives. Because PACs are so simple to form -- especially when there are consultancies that appear designed to bundle PAC creation with a digital distribution plan -- PACs can be thought of as a form of specialized link farm (https://en.wikipedia.org/wiki/Link_farm). Just like link farms, PACs provide a way of spamming the conversation with messages from orgs that can be discarded, and subsequently reborn under a different name.

The message matters, but the message in this case becomes clearer when filtered through the ecosystem of PACs that helped create it.

One final note

The research that fueled this writeup isn't especially time consuming. It took me about 10 minutes of searching. The writeup took a while -- they always do -- but the process of doing a quick triage is very accessible. More importantly, every time you do it, you get better and faster. Also, it's not necessary to review every ad. Just do some - learn the process. By learning this research process, you can both see the forces that help shape (some highly misleading) political advertisements and get a clearer view into the process that allows money to shape politics. We are better able to disrupt and debunk what we understand.

Four Things I Would Recommend for Mycroft.ai

2 min read

Mycroft.ai is an open source voice assistant. It provides functionality that compares with Alexa, Google Home, Siri, etc, but with the potential to avoid the privacy issues of these proprietary systems. Because of the privacy advantages of using an open source system, Mycroft has an opportunity to distinguish itself in ways that would be meaningful, especially within educational settings.

If I were part of the team behind Mycroft.ai, these are four things I would recommend doing as soon as possible (and possibly this work is already in progress -- as I said, I'm not part of the team).

  1. Write a blog post (and/or update documentation) that describes exactly how data are used, paying particular attention to what stays on the device and what data (if any) need to be moved off the device.
  2. Develop curriculum for using Mycroft.ai in K12 STEAM classes, especially focusing on the Raspberry Pi and Linux versions.
  3. Build skills that focus on two areas: learning the technical details required to build skills for Mycroft devices; and a series of equity and social justice resources, perhaps in partnership with Facing History and/or Teaching Tolerance. As an added benefit, the process of building these skills could form the basis of the curriculum for point 2, above.
  4. Get foundation or grant funding to supply schools doing Mycroft development with Mycroft-compatible devices.

Voice activated software can be done well without creating unnecessary privacy risks. Large tech companies have a spotty track record -- at best -- of creating consistent, transparent rules about how they protect and respect the privacy of the people using their systems. Many people -- even technologists -- aren't aware of the alternatives. That's both a risk and an opportunity for open source and open hardware initiatives like Mycroft.ai.

Fordham CLIP Study on the Marketplace for Student Data: Thoughts and Reactions

9 min read

A new study was released today from the Fordham Center on Law and Information Policy (Fordham CLIP) on the marketplace for student data. It's a compelling read, and the opening sentence of the abstract provides a clear description of what is to follow:

Student lists are commercially available for purchase on the basis of ethnicity, affluence, religion, lifestyle, awkwardness, and even a perceived or predicted need for family planning services.

The study includes four recommendations that help frame the conversation. I'm including them here as points of reference.

  1. The commercial marketplace for student information should not be a subterranean market. Parents, students, and the general public should be able to reasonably know (i) the identities of student data brokers, (ii) what lists and selects they are selling, and (iii) where the data for student lists and selects derives. A model like the Fair Credit Reporting Act (FCRA) should apply to compilation, sale, and use of student data once outside of schools and FERPA protections. If data brokers are selling information on students based on stereotypes, this should be transparent and subject to parental and public scrutiny.
  2. Brokers of student data should be required to follow reasonable procedures to assure maximum possible accuracy of student data. Parents and emancipated students should be able to gain access to their student data and correct inaccuracies. Student data brokers should be obligated to notify purchasers and other downstream users when previously transferred data is proven inaccurate and these data recipients should be required to correct the inaccuracy.
  3. Parents and emancipated students should be able to opt out of uses of student data for commercial purposes unrelated to education or military recruitment.
  4. When surveys are administered to students through schools, data practices should be transparent, students and families should be informed as to any commercial purposes of surveys before they are administered, and there should be compliance with other obligations under the Protection of Pupil Rights Amendment (PPRA).

The study uses a conservative methodology to identify vendors selling student data, so in practical terms, they are almost certainly under-counting the number of vendors selling student data. One of the vendors selling student data identified in the survey clearly states that they have information on students between 2 and 13:

Our detailed and exhaustive set of student e-mail database has names of students between the ages of 2 and 13.

I am including a screenshot of the page to account for any changes that happen to this page into the future.

Students between 2 and 13

The study details multiple ways that data brokers actively (and in some cases, enthusiastically) exploit youth. One vendor had no qualms about selling a list of 14 and 15 year old girls for targeting around family planning services. The following quotation is from a sales representative responding to an inquiry from a researcher:

I know that your target audience was fourteen and fifteen year old girls for family planning services. I can definitely do the list you’re looking for -- I just have a couple more questions.

The study also highlights that, even for a motivated and informed research team, uncovering details about where data is collected from is often not possible. Companies have no legal obligation to disclose this information, and therefore, they don't. The observations of the research team dovetail with my firsthand experience researching similar issues. Unless there is a clear and undeniable legal reason for a company to disclose a specific piece of information, many companies will stonewall, obfuscate, or outright refuse to be transparent.

The study also emphasizes two of the elephants in the room regarding the privacy of students and youth: both FERPA and COPPA have enormous loopholes, and it's possible to be fully compliant with both laws and still do terrible things that erode privacy. The study covers some high level details, and as I've described in the past, FERPA directory information is valuable information.

The study also highlights the role of state level laws like SOPIPA. SOPIPA-style laws have been passed in multiple states nationwide, starting in California. This might actually feel like progress. However, when one stops and realizes that there have been a grand total of zero sanctions under SOPIPA, it's hard to escape the sense that some regulations are more privacy theater than privacy protection. While a strict count of sanctions under SOPIPA is a blunt measure of effectiveness, the lack of regulatory activity under SOPIPA since the law's passage either indicates that all the problems identified in SOPIPA have been fixed (hah!) or that the impact of the regulation is nonexistent. If a law passes and it's not enforced, what is the impact?

The report also notes that the data collected, shared, and/or sold goes far beyond simple contact information. The report details that one vendor collects information on a range of physical and mental health issues, family history regarding domestic abuse, and immigration status.

One bright spot in the report is that, among the small number of school districts that responded to the researchers' requests for information, none appeared to be selling or sharing student information to advertisers. However, even this bright area is undermined by the small number of districts surveyed, and by the fact that some districts took over a year to respond and at least one district did not respond at all.

The report details the different ways that school-age youth are profiled by data brokers, with their information sold to support targeted advertising. While the report doesn't emphasize this, we need to understand profiling and advertising as separate but related issues. A targeted ad is an indication that profiling is occurring; profiling is an indication that data collection from or about students is occurring -- but we need to address the specific problems of each of these elements distinctly. Advertising, profiling (including combining data from multiple sources), and data collection without clearly obtained informed consent are each distinct problems that should be understood both individually and collectively.

If you work with youth (or, frankly, if you care about the future and want to add a layer of depth to how you understand information literacy) the report should be read multiple times, and shared and discussed with your colleagues. I strongly encourage this both as required reading in teacher training programs and as back to school reading for educators in the fall of 2018.

But, taking a step back, the implications of this report shine a light on serious holes in how we understand "student" data. The report also demonstrates how the current requirement that a person be able to show a demonstrable harm from misuse of personal information is a sham. Moving forward, we need to refine and revise how we discuss misuse of information.

Many of the problems and abuses arise from systemic and entrenched lack of transparency. As demonstrated in the report:

It is difficult for parents and students to obtain specificity on data sources with an email, a phone call, or an internet search. From the perspective of parents and students, there is no data trail. Likewise, parents and students are generally unable to know how and why certain student lists were compiled or the basis for designating a student as associated with a particular attribute. Despite all of this, student lists are commercially available for purchase on the basis of ethnicity, affluence, religion, lifestyle, awkwardness, and even a perceived or predicted need for family planning services.

This is what information asymmetry looks like, and it mirrors multiple other power imbalances that stack the deck against those with less power. As documented in multiple places in the survey, a team of skilled researchers with legal, educational, and technical expertise were not able to pierce the veil of opacity maintained by data brokers and advertisers. It is both unrealistic and unethical to expect a person to be able to demonstrate harm from the use of specific data elements when the companies in a position to do the harm have no requirement to explain anything about their practices, including what data they used and how they obtained it.

But taking an additional step back, the report calls into question what we consider "student" data. The marketplace for data on school age people looks a lot like the market for people who are past the traditional school age: a complete lack of transparency about how the data are gathered, sold, used, and retained. It feels worse with youth because adults are somehow supposed to know better, but this is a fallacy. When we turn 18, or 21, or 35, or 50, we aren't magically given a guidebook about how data brokers and profiling work. The information asymmetry documented in the Fordham report is the same for adults as it is for youth. Both adults and youth face comparable problems, but the injustice of the current systems is more obvious when kids are the target.

Companies collect data about people, and some of the people happen to be students. Possibly, some of these data might have been collected within an educational context. But, even if the edtech industry had airtight privacy and security, multiple other sources for data about youth exist. Between video games, health-related data breaches (which often contain data about youth and families in the breached records), Disney and comparable companies, Equifax, Experian, TransUnion, Acxiom, Musical.ly, Snapchat, Instagram, Facebook Messenger, parental oversharing on social media, and publicly available data sources, there is no shortage of readily available data about youth, their families, and their demographics. When we pair that with technology companies (both inside and outside edtech) going out of business and liquidating their data as part of the bankruptcy process, the ability to get information about youth and their families is clearly not an issue.

It's more accurate to describe "student" data as data that have been collected on people who are school age. To be very clear, data collected in a learning environment is incredibly sensitive, and deserves strong protections. But drawing a line between "educational" data and everything else misses the point. Non-educational data can be used to do the same types of redlining as educational data. If we claim to care about student privacy, then we need to do a better job with privacy in general.

This is what is at stake when we talk about the need to keep our ISPs from selling our web browsing history, and our cellular providers from selling our usage information -- including precise information, in real time, about our location. What we consider student data is tied up in the data trails of their parents, friends, relatives, colleagues -- information about a younger sister is tied to that of her older siblings. Privacy isn't an individual trait. We are all in this together.

Read the study. Share the study. It's important work that helps quantify and clarify issues related to data privacy for adults and youth.

Privacy Postcard: Starbucks Mobile App

2 min read

For more information about Privacy Postcards, read this post.

General Information

App permissions

The Starbucks app has permissions to read your contacts, and to get network location and location from GPS.

Starbucks app permissions

Access contacts

The application permissions indicate that the app can access contacts, and this is reinforced in the privacy policy.

Law enforcement

Starbucks' terms specify that they will share data if sharing the information is required by law, or if sharing information helps protect Starbucks' rights.

Starbucks law enforcement

Location information and Device IDs

Starbucks can use location as part of a broader user profile.

Starbucks collects location info

Data Combined from External Sources

The terms specify that Starbucks can collect, store, and use information about you from multiple sources, including other companies.

Starbucks data collection

Third Party Collection

The terms state that Starbucks can allow third parties to collect device and location information.

Third party

Social Sharing or Login

The terms state that Starbucks facilitates tracking across multiple services.

Social sharing

Summary of Risk

The Starbucks mobile app has several problematic areas. Individually, they would all be grounds for concern. Collectively, they show a clear lack of regard for the privacy of people who use the Starbucks app. The fact that the service harvests contacts, and harvests location information, and allows selected information to be used by third parties to profile people creates significant privacy risk.

People shouldn't have to sell out their contact list and share their physical location to get a cup of coffee. I love coffee as much as the next person, but avoid the app (and maybe go to a local coffee shop), pay cash, and tip the barista well.

Privacy Postcards, or Poison Pill Privacy

10 min read

NOTE: While this is obvious to most people, I am restating this here for additional emphasis: this is my personal blog, and only represents my personal opinions. In this space, I am only writing for myself. END NOTE.

I am going to begin this post with a shocking, outrageous, hyperbolic statement: privacy policies are difficult to read.

Shocking. I know. Take a moment to pull yourself up from the fainting couch. Even Facebook doesn't read all the necessary terms. Policies are dense, difficult to parse, and in many cases appear to be overwhelming by design.

When evaluating a piece of technology, "regular" people want an answer to one simple question: how will this app or service impact my privacy?

It's a reasonable question, and this process is designed to make it easier to get an answer to that question. When we evaluate the potential privacy risks of a service, good practice can often be undone by a single bad practice, so the art of assessing risk is often the art of searching for the poison pill.

To highlight that this process is focused on surfacing risks, and is not comprehensive at all, I'm calling it Privacy Postcards, or Poison Pill Privacy. It is designed to highlight potential problem areas that impact privacy. It's also designed to be straightforward enough that anyone can do this. Various privacy concerns are broken down, and include keywords that can be used to find relevant text in the policies.

To see an example of what this looks like in action, check out this example. The rest of this post explains the rationale behind the process.

If anyone reading this works in K12 education and you want to use this with students as part of media literacy, please let me know. I'd love to support this process, or just hear how it went and how the process could be improved.

1. The Process

Application/Service

Collect some general information about the service under evaluation.

  • Name of Service:
  • Android App:
  • Privacy Policy url:
  • Policy Effective Date:

App permissions

Pull a screenshot of selected app permissions from the Google Play store. The iOS store from Apple does not support the transparency that is implemented in the Google Play store. If the service being evaluated does not have a mobile app, or only has an iOS version, skip this step.

The listing of app permissions is useful because it highlights some of the information that the service collects. The listing of app permissions is not a complete list of what the service collects, nor does it provide insight into how the information is used, shared, or sold. However, the breakdown of app permissions is a good tool to use to get a snapshot of how well or poorly the service limits data collection to just what is needed to deliver the service.

Access contacts

Accessing contacts from a phone or address book is one way that we can compromise our own privacy, and the privacy of our friends, family, and colleagues. This can be especially true for people who work in jobs where they have access to sensitive or privileged information. For example, if a therapist had contact information of patients stored in their phone and that information was harvested by an app, that could potentially compromise the privacy of the therapist's clients.

When looking at if or how contacts are accessed, it's useful to cross-reference what the app permissions tell us against what the privacy policy tells us. For example, if the app permissions state that the app can access contacts and the privacy policy says nothing about how contacts are protected, that's a sign that the privacy policy could have areas that are incomplete and/or inadequate.
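The cross-reference described above can be sketched in a few lines of Python. This is a minimal illustration, not a real tool: the `PERMISSION_TERMS` mapping and the sample policy sentence are made up for the example, and a real policy might describe a permission in words this simple substring check would miss.

```python
# Hypothetical mapping from a permission label to words a policy
# would likely use when it discusses that permission.
PERMISSION_TERMS = {
    "contacts": ["contact", "address book", "friend"],
    "location": ["location", "gps"],
    "camera": ["camera", "photo"],
}


def unexplained_permissions(permissions, policy_text):
    """Return the permissions that the policy text never appears to mention."""
    text = policy_text.lower()
    missing = []
    for permission in permissions:
        # Fall back to the permission name itself if we have no term list.
        terms = PERMISSION_TERMS.get(permission, [permission])
        if not any(term in text for term in terms):
            missing.append(permission)
    return missing


if __name__ == "__main__":
    sample_policy = "We collect your location to find nearby stores."
    print(unexplained_permissions(["contacts", "location"], sample_policy))
```

Any permission this returns is a prompt to go back and read the policy more closely, not proof that the policy is silent on it.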

Keywords: contact, friend, list, access

Law enforcement

Virtually every service in the US needs to comply with law enforcement requests, should they come in. However, the language that a service uses about how they comply with law enforcement requests can tell us a lot about a service's posture around protecting user privacy.

Additionally, if a service has no language in their terms about how they respond to law enforcement or other legal requests, that can be an indicator that the terms have other areas that are incomplete and/or inadequate.

Keywords: legal, law enforcement, comply

Location information and Device IDs

As individual data elements, both a physical location and a device ID are sensitive pieces of information. It's also worth noting that there are multiple ways to get location information, and different ways of identifying an individual device. The easiest way to get precise location information is via the GPS functionality in mobile devices. However, IP addresses can also be mapped to specific locations, and a string of IP addresses (ie, what someone would get if they connected to a wireless network at their house, a local coffee shop, and a library) can give a sense of someone's movement over time.

Device IDs are unique identifiers, and every phone or tablet has multiple IDs that are unique to the device. Additionally, browser fingerprinting can be used on its own or alongside other IDs to precisely identify an individual.

The combination of a device ID and location provides the grail for data brokers and other trackers, such as advertisers: the ability to tie online and offline behavior to a specific identity. Once a data broker knows that a person with a specific device goes to a set of specific locations, they can use that information to refine what they know about a person. In this way, data collectors build and maintain profiles over time.

Keywords: location, zip, postal, identifier, browser, device, ID, street, address

Data Combined from External Sources

As noted above, if a data broker can use a device ID and location information to tie a person to a location, they can then combine information from external sources to create a more thorough profile about a person, and that person's colleagues, friends, and families.

We can see examples of data recombination in how Experian sorts humans into classes: data recombination helps them identify and distinguish their "Picture Perfect Families" from the "Stock cars and State Parks" and the "Urban Survivors" and the "Small Towns Shallow Pockets".

And yes, the company combining this data and making these classifications is the same company that sold data to an identity thief and was responsible for a breach affecting 15 million people. Data recombination matters, and device identifiers within data sets allow companies to connect disparate data sources into a larger, more coherent profile.
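To make the mechanics concrete, here is a minimal sketch of how a shared device identifier lets separate data sets be stitched into one profile. Every record, field name, and device ID below is invented for illustration; real data brokers work with far larger and messier sources.

```python
from collections import defaultdict

# Entirely made-up records from two hypothetical sources.
location_pings = [
    {"device_id": "abc-123", "place": "coffee shop"},
    {"device_id": "abc-123", "place": "clinic"},
    {"device_id": "xyz-789", "place": "library"},
]
purchase_records = [
    {"device_id": "abc-123", "item": "running shoes"},
]


def build_profiles(*sources):
    """Merge records from any number of sources, keyed on device ID."""
    profiles = defaultdict(lambda: defaultdict(list))
    for source in sources:
        for record in source:
            device_id = record["device_id"]
            for field, value in record.items():
                if field != "device_id":
                    profiles[device_id][field].append(value)
    # Convert the nested defaultdicts to plain dicts for readability.
    return {device: dict(fields) for device, fields in profiles.items()}


if __name__ == "__main__":
    for device, profile in build_profiles(location_pings, purchase_records).items():
        print(device, profile)
```

Neither source says much on its own. The join on `device_id` is what turns two unremarkable lists into a record of where one person goes and what they buy.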

Keywords: combine, enhance, augment, source

Third Party Collection

If a service allows third parties to collect data from users of the service, that creates an opportunity for each of these third parties to get information about people in the ways that we have described above. Third parties can access a range of information (such as device IDs, browser fingerprints, and browsing histories) about users on a service, and frequently, there is no practical way for people using a service to know what third parties are collecting information, or how these third parties will use it.

Additionally, third parties can also combine data from multiple sources.

Keywords: third, third party, external, partner, affiliate

Social Sharing or Login

Social Sharing or Login, when viewed through a privacy lens, should be seen as a specialized form of third party data collection. With social login, however, information about a person can be exchanged between the two services, or taken from one service.

Social login and social sharing features (like the Facebook "like" button, a "Pin it" link, or a "Share on Twitter" link) can send tracking information back to the home sites, even if the share never happens. Solutions like this option from Heise highlight how this privacy issue can be addressed.

Keywords: login, external, social, share, sharing

Education-specific Language

This category only makes sense on services that are used in educational contexts. For services that are only used in a consumer context, this section might be superfluous.

As noted below, I'm including COPPA in the list of keywords here even though COPPA is a consumer law. Because COPPA (in the US) is focused on children under 13, there are times when COPPA connects with educational settings.

Keywords: parent, teacher, student, school, family, education, FERPA, child, COPPA

Other

Because this list of concerns is incomplete, and there are other problematic areas, we need a place to highlight these concerns if and when they come up. When I use this structure, I will use this section to highlight interesting elements within the terms that don't fit into the other sections.

If, however, there are elements in the other sections that are especially problematic, I probably won't spend the time on this section.

Summary of Risk

This section is used to summarize the types of privacy risks associated with the service. As with this entire process, the goal here is not to be comprehensive. Rather, this section highlights potential risks, and whether those risks are in line with what a service does. For example, if a service collects location information, how is that information both protected from unwarranted use by third parties and used to benefit the user?
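For anyone who would rather not search a policy by hand, the keyword lists above can be run against the policy text with a short script. The sketch below is one possible approach, not part of the process itself: it assumes the policy has already been saved as plain text, and the matching rules (whole-word matching for short keywords like "ID", prefix matching for longer ones) are my own choices. It only surfaces sentences to read; it does not judge them.

```python
import re

# Keyword lists copied from the categories above.
CATEGORY_KEYWORDS = {
    "Access contacts": ["contact", "friend", "list", "access"],
    "Law enforcement": ["legal", "law enforcement", "comply"],
    "Location information and Device IDs": [
        "location", "zip", "postal", "identifier", "browser",
        "device", "ID", "street", "address",
    ],
    "Data Combined from External Sources": ["combine", "enhance", "augment", "source"],
    "Third Party Collection": ["third", "third party", "external", "partner", "affiliate"],
    "Social Sharing or Login": ["login", "external", "social", "share", "sharing"],
    "Education-specific Language": [
        "parent", "teacher", "student", "school", "family",
        "education", "FERPA", "child", "COPPA",
    ],
}


def keyword_pattern(keyword):
    # Short keywords such as "ID" are matched as whole words; longer ones
    # also match suffixes, so "contact" finds "contacts" and "contacted".
    suffix = r"\b" if len(keyword) <= 3 else r"\w*"
    return re.compile(r"\b" + re.escape(keyword) + suffix, re.IGNORECASE)


def scan_policy(policy_text):
    """Return {category: [sentences that contain one of its keywords]}."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", policy_text) if s.strip()]
    hits = {}
    for category, keywords in CATEGORY_KEYWORDS.items():
        patterns = [keyword_pattern(k) for k in keywords]
        matched = [s for s in sentences if any(p.search(s) for p in patterns)]
        if matched:
            hits[category] = matched
    return hits


if __name__ == "__main__":
    sample = ("We may share your device identifier with partners. "
              "We comply with valid legal requests.")
    for category, sentences in scan_policy(sample).items():
        print(category)
        for sentence in sentences:
            print("   ", sentence)
```

A category with no hits is worth noting too: as described above, silence on a topic can be a sign that the terms are incomplete.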

2. Closing Notes

At the risk of repeating myself unnecessarily, this process is not intended to be comprehensive.

The only goal here is to streamline the process of identifying and describing poison pills buried in privacy policies. This method of evaluation is not thorough. It will not capture every detail. It will even miss problems. But, it will catch a lot of things as well. In a world where nothing is perfect, this process will hopefully prove useful.

The categories listed here all define different ways that data can be collected and used. One of the categories explicitly left out of the Privacy Postcard is data deletion. This is not an oversight; this is an intentional choice. Deletion is not well understood, and actual deletion is easier to do in theory than in practice. This is a longer conversation, but the main reason that I am leaving deletion out of the categories I include here is that data deletion generally doesn't touch any data collected by third party adtech allowed on a service. Because of this, assurances about data deletion can often create more confusion. The remedy to this, of course, is for a service to not use any third party adtech, and to have strict contractual requirements with any third party services (like analytics providers) that restrict data use. Many educational software providers already do this, and it would be great to see this adopted more broadly within the tech industry at large.

The ongoing voyage of MySpace data - sold to an adtech company in 2011, re-sold in 2016, and breached in 2016 - highlights that data that is collected and not deleted can have a long shelf life, completely outside the context in which it was originally collected.

For those who want to use this structure to create your own Privacy Postcards, I have created a skeleton structure on Github. Please, feel free to clone this, copy it, modify it, and make it your own.

Facebook, Cambridge Analytica, Privacy, and Informed Consent

4 min read

There has been a significant amount of coverage and commentary on the new revelations about Cambridge Analytica and Facebook, and how Facebook's default settings were exploited to allow personal information about 50 million people to be exfiltrated from Facebook.

There are a lot of details to this story - if I ever have the time (unlikely), I'd love to write about many of them in more detail. I discussed a few of them in this thread over on Twitter. But as we digest this story, we need to move past the focus on the Trump campaign and Brexit. This story has implications for privacy and our political systems moving forward, and we need to understand them in this broader context.

But for this post, I want to focus on two things that are easy to overlook in this story: informed consent, and how small design decisions that don't respect user privacy allow large numbers of people -- and the systems we rely on -- to be exploited en masse.

The following quote is from a NY Times article - the added emphasis is mine:

Dr. Kogan built his own app and in June 2014 began harvesting data for Cambridge Analytica. The business covered the costs — more than $800,000 — and allowed him to keep a copy for his own research, according to company emails and financial records.

All he divulged to Facebook, and to users in fine print, was that he was collecting information for academic purposes, the social network said. It did not verify his claim. Dr. Kogan declined to provide details of what happened, citing nondisclosure agreements with Facebook and Cambridge Analytica, though he maintained that his program was “a very standard vanilla Facebook app.”

He ultimately provided over 50 million raw profiles to the firm, Mr. Wylie said, a number confirmed by a company email and a former colleague. Of those, roughly 30 million — a number previously reported by The Intercept — contained enough information, including places of residence, that the company could match users to other records and build psychographic profiles. Only about 270,000 users — those who participated in the survey — had consented to having their data harvested.

The first highlighted quotation gets at what passes for informed consent. However, in this case, for people to give informed consent, they had to understand two things, neither of which is obvious or accessible: first, they had to read the terms of service for the app and understand how their information could be used and shared. But second -- and more importantly -- the people who took the quiz needed to understand that by taking the quiz, they were also sharing personal information of all their "friends" on Facebook, as permitted and described in Facebook's terms. This was a clearly documented feature available to app developers that wasn't modified until 2015. I wrote about this privacy flaw in 2009 (as did many other people over the years). But, this was definitely insider knowledge, and the expectation that a person getting paid three dollars to take an online quiz (for the Cambridge Analytica research) would read two sets of dense legalese as part of informed consent is unrealistic.

As reported in the NYT and quoted above, only 270,000 people took the quiz for Cambridge Analytica - yet these 270,000 people exposed 50,000,000 people via their "friends" settings. This is what happens when we fail to design for privacy protections. To state this another way, this is what happens when we design systems to support harvesting information for companies, as opposed to protecting information for users.
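The scale of that exposure is easier to feel as simple arithmetic. The two figures below are the ones reported by the NYT and quoted above; the per-person number is only an average, and it ignores overlap between friend lists.

```python
quiz_takers = 270_000          # people who took the quiz, per the NYT
profiles_exposed = 50_000_000  # profiles harvested, per the NYT

profiles_per_quiz_taker = profiles_exposed / quiz_takers
share_who_consented = quiz_takers / profiles_exposed

print(f"{profiles_per_quiz_taker:.0f} profiles exposed per quiz taker")
print(f"{share_who_consented:.2%} of exposed people took the quiz")
```

In other words, each person who took the quiz exposed roughly 185 other people, and only about half of one percent of the people in the dataset ever saw the quiz or its terms.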

Facebook worked as designed here, and this design allowed the uninformed decisions of 270,000 people to create a dataset that potentially undermined our democracy.

Spectre, Meltdown, Adtech, Malware, and Encryption

2 min read

This week has seen a lot of attention paid to Spectre and Meltdown, and justifiably so. Get the technical details here: https://spectreattack.com/

These issues are potentially catastrophic for cloud providers (see the details in the articles linked above) but they can also affect regular users on the web. While there are updates available for browsers that mitigate the attack, and updates available for most major operating systems, updates only work when they are applied, which means that we will almost certainly see vulnerable systems into the foreseeable future.

I was very happy to see both Nicholas Weaver and Zeynep Tufekci addressing the connection between these vulnerabilities and adtech. 

Adtech leaves all internet users exposed to malware - it has for a while, and, in its current form, adtech exposes us to unneeded risk (as well as compromising our privacy). This risk is increased because many commonly used adtech providers do not support or require encryption.

To examine traffic over the web, use an open source tool like OWASP ZAP. If you are running a Linux machine, routing traffic through your computer and OWASP ZAP is pretty straightforward if you set your computer up to act as an access point.

But, using these basic tools, it's simple to see how widespread the issue of unencrypted adtech actually is, in both web sites and mobile applications (on a related note, some mobile apps actually get their content via an unencrypted zip file. You read that correctly - the expected payload is an unencrypted zip file. That's a topic for a different post, and I'm not naming names, but the fact that this is accepted behavior within app stores in 2018 should raise some serious questions).

The unencrypted adtech includes javascript sent to the browser or the device. Because this javascript is sent unencrypted over the network, intercepting it and modifying it would be pretty straightforward, which exposes people to increased risk. 
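Once traffic has been captured, picking out the unencrypted requests is mostly a matter of looking at the URL scheme. The sketch below assumes you have already exported the requested URLs from your proxy into a plain Python list (the export step is not shown), and the example URLs are placeholders, not real adtech endpoints.

```python
from urllib.parse import urlparse


def unencrypted_requests(urls):
    """Return the URLs that were requested over plain HTTP."""
    return [u for u in urls if urlparse(u).scheme.lower() == "http"]


def unencrypted_scripts(urls):
    """Narrow the list to plain-HTTP requests for JavaScript files."""
    return [
        u for u in unencrypted_requests(urls)
        if urlparse(u).path.lower().endswith(".js")
    ]


if __name__ == "__main__":
    # Placeholder URLs standing in for a captured session.
    captured = [
        "http://ads.example.com/tag.js?x=1",
        "https://cdn.example.com/app.js",
        "http://tracker.example.net/pixel.gif",
    ]
    print(unencrypted_requests(captured))
    print(unencrypted_scripts(captured))
```

The second list is the one to worry about most: those are scripts that anyone on the same network could, in principle, modify before they reach the browser.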

The next time you are in a coffee shop and see a kid playing a game on their parent's device while the parent talks with a friend, ask yourself: is that kid playing an online game, or downloading malware, or both? Because so much adtech is sent unencrypted, anything is possible.

AdTech, the New York Times, and Normalizing Fascism

4 min read

Today, the New York Times published a piece that normalized and humanized Nazis. I'm not linking to it; feel free to find it on your own, but I'm not giving it any additional traffic.

As was noted on Twitter, Hitler also received flattering press. 

However, because the NYT puts ads on its online content, they make money even when they put out dangerously irresponsible journalism. Due to the perversities of human nature, they probably make more money on the dangerously irresponsible pieces.

But, so does the adtech industry that the NYT uses to target ads. The adtech industry collects and combines information from everyone who visits this page. They aren't as visible as the NY Times, because they operate in the background, but adtech companies are happy to profit while the rest of us suffer. 

Here are some of the companies that had ads placed alongside a disgusting story that normalizes Nazis, fascism, racism, and everything that is wrong with how we are currently failing to address hate in our country. To state what should be obvious, none of these brands chose to advertise on this specific story. However, their presence alongside this story gives an explicit stamp of approval to prose that normalizes fascism. We'll return to this at the end of the piece.

Citi wants us all to know that, provided you use a Citi card, they are okay with Nazis.

Fascist literature and firepits

You know what goes great with fascist literature? A firepit from ultimatepatio.com

sig heil and grills

If your ex-band mate whitewashes fascism as being "proud of their heritage" why not invite them over for a cookout? BBQGuys.com thinks grilling with Nazis is great.

Hitler and Mussolini and toys

Better make room for some gifts from Fat Brain Toys alongside your Hitler and Mussolini literature.

T-Mobile, your network for Nazis

And if you're calling your Nazi friends, T-Mobile wants you to do it on their network.

Nazis and firepits

And, Starfire Direct wants that special Nazi to Reignite Your Life.

As I said earlier, none of these companies chose to advertise on this page. But, because of the lazy mechanism that is modern adtech, "mistakes" like this happen all the time. These "mistakes" rely on an industry that operates in the shadows, makes vague promises to do better, and is predicated on constant monitoring of the web sites we visit, the places we shop (including using facial recognition), and connecting online and offline behavior. Despite industry promises of how this near-constant tracking doesn't impact our privacy, a recent study highlighted that, using AdTech as it was designed, it takes about $1,000 US to identify an individual using mobile ads.

But returning to the NY Times piece that normalizes fascism in the United States and the advertising technology used by the NY Times and just about every other publisher out there: sloppy journalism and lazy advertising both take shortcuts that we can't afford.

Modern adtech is appealing because, for brands, it offers ease of access alongside the promise of precisely targeting users. Sloppy journalism is "appealing" because it looks suspiciously like the real thing, and -- due to how adtech works -- it can ring up revenue in the short term.

But, given where we are, we need to stop looking for shortcuts. Doing things well feels like more work as we get started, but we are currently experiencing what happens in our information and political systems when we take shortcuts.

End Note: The article had tracking calls from over 50 different adtech companies, which is actually on the average to low side of other mainstream news sites. The adtech companies used by NY Times include most of the usual suspects, including Facebook, Google/Doubleclick, LinkedIn, Moat, Twitter, Amazon, AppNexus, Media.net, Bluekai, and AddThis.

Daily Post, October 24, 2017

5 min read

It's been a busy few days, but here are some of the things I've been reading. Enjoy!

Open Source Code from ProPublica to Detect Political Ads

While the lawyers at major tech companies complain that it's too hard to find political ads, ProPublica released code showing how easy it is to identify political ads.

We're asking our readers to use this extension when they are browsing Facebook. While they are on Facebook a background script runs to collect ads they see. The extension shows those ads to users and asks them to decide whether or not a particular ad is political. Serverside, we use those ratings to train a naive bayes classifier that then automatically rates the other ads we've collected. The extension also asks the server for the most recent ads that the classifier thinks are political so that users can see political ads they haven't seen. We're careful to protect our user's privacy by not sending identifying information to our backend server.

Adtech won't fix this problem. They have a financial interest in not fixing this problem. Every day that passes without a fix for this problem is another day they make money from undermining our democracy. I also doubt the ability of our current crop of lawmakers to understand the problem, or understand a good solution.
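To give a sense of how little machinery the kind of classifier mentioned in the ProPublica excerpt needs, here is a from-scratch naive Bayes sketch. This is not ProPublica's code, and the four training "ads" are invented; it only illustrates the general technique of learning from labeled examples and then rating unseen text.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of ads
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, text, label):
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        total_ads = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, ad_count in self.label_counts.items():
            # log P(label) + sum of log P(word | label)
            score = math.log(ad_count / total_ads)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NaiveBayes()
    # Invented examples standing in for ads that users have rated.
    model.train("vote for the senate candidate on tuesday", "political")
    model.train("tell congress to vote no on the bill", "political")
    model.train("buy new running shoes on sale today", "other")
    model.train("free shipping on coffee and shoes", "other")
    print(model.classify("vote no on the senate bill"))
    print(model.classify("coffee shoes on sale"))
```

A real system would need far more training data and better text handling, but the core idea fits in about forty lines.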

BlockBear

Blockbear is an ad blocker for iOS, made by the same folks that make TunnelBear VPN.

A really simple, often adorable adblocker for your iPhone or iPad.

  • Blocks ads and invasive online tracking
  • Load many websites 3-5 times faster
  • Whitelist your favorite websites
  • Has bears

You could download another adblocker, but then you wouldn't have a bear!

While I haven't used this, it looks interesting.

Obfuscation Workshop Report

The report from the International Workshop on Obfuscation is now released and available for download.

We have asked our panelists to each provide a brief essay summarizing their project, concept, application—with an emphasis on the questions, challenges, and discussions raised during the weekend. As with the workshop itself, this report is a starting point rather than an end point.

I haven't read this yet, so have little to say on the contents, but obfuscation is one of many tools we have to protect our privacy, and make the data collected about us less useful.

China's "Social Credit" System

China is rolling out a system that publicly measures every citizen. Thought experiment: how much more data would a country need besides what Facebook or Google already collect to create a similar system?

Imagine a world where many of your daily activities were constantly monitored and evaluated: what you buy at the shops and online; where you are at any given time; who your friends are and how you interact with them; how many hours you spend watching content or playing video games; and what bills and taxes you pay (or not). It's not hard to picture, because most of that already happens, thanks to all those data-collecting behemoths like Google, Facebook and Instagram or health-tracking apps such as Fitbit. But now imagine a system where all these behaviours are rated as either positive or negative and distilled into a single number, according to rules set by the government. That would create your Citizen Score and it would tell everyone whether or not you were trustworthy. Plus, your rating would be publicly ranked against that of the entire population and used to determine your eligibility for a mortgage or a job, where your children can go to school - or even just your chances of getting a date.

This is what data does, very well. Data supports systems that rate, rank, sort, all day long. This is not a neutral activity. Anyone who claims otherwise is not adequately informed.

Can We All Just Encrypt Our Stuff Already?

Troy Hunt lays out a clear roadmap for implementing encryption on a web site.

Well, it can be more difficult but it can also be fundamentally simple. In this post I want to detail the 6-step "Happy Path", that is the fastest, easiest way you can get HTTPS up and running right.

This change is coming, so please, just do this. Now. Please.

For $1000 You Can Track Someone Via Adtech

The research in this paper shows how the core features of an ad network can be used to track an individual.

There is a fundamental tension at work in the online advertising ecosystem: the precision targeting features we used for these attacks have been developed for legitimate business purposes. Advertisers are incentivized to provide more highly targeted ads, but each increase in targeting precision inherently increases ADINT capabilities.

This is how data tracking works. Data allows us to ask questions. The researchers in this study didn't exploit a bug. They used the advertising systems exactly as they were designed. This technique would almost certainly work to target children.

Facebook Tests Gouging Publishers

Facebook can spin this effort to gouge publishers in a few ways, but their move to pull all non-sponsored posts from users' feeds would force publishers to pay Facebook in order to reach people.

A new system being trialled in six countries including Slovakia, Serbia and Sri Lanka sees almost all non-promoted posts shifted over to a secondary feed, leaving the main feed focused entirely on original content from friends, and adverts.

Facebook might even try and spin this as an effort to combat misinformation, but this move really demonstrates what the "meritocracy" looks like in Silicon Valley: if you want access, pay the people who control it. For any publishers who had any illusions about how Facebook views them, this move should dispel all doubts. It's also worth noting where Facebook rolled this test out: smaller countries with, presumably, a userbase with fewer connections.

Daily Post - October 18, 2017

4 min read

Some of the articles and news that crossed my desk on October 18, 2017. Enjoy!

Facebook and Google Worked with Racist Campaigns, at Home and Abroad

Both Facebook and Google worked closely with an ad agency running blatantly racist ads during the 2016 campaign. Both companies worked on targeting more precisely, and provided a range of technical support.

Facebook advertising salespeople, creative advisers and technical experts competed with sales staff from Alphabet Inc.’s Google for millions in ad dollars from Secure America Now, the conservative, nonprofit advocacy group whose campaign included a mix of anti-Hillary Clinton and anti-Islam messages, the people said.

Facebook also worked with at least one campaign putting racist ads in Germany to target German voters. This is what the "neutrality" of tech looks like: racism with money behind it is always welcome. The data collection and subsequent profiling of people is a central element of how racism is spread, and how data brokers and advertising companies work together to profit.

Russia Recruited Activists to Stage Protests

The people who were recruited didn't know they were working with Russians. But this is an odd corner of Russian attempts to create noise and conflict around issues related to race.

Russia’s most infamous troll farm recruited US activists to help stage protests and organize self-defense classes in black communities as part of an effort to sow divisions in US society ahead of the 2016 election and well into 2017.

As always, research your funders and contacts.

US Government Wants the Right to Access Any Data Stored Anywhere

The US Supreme Court will hear a case that looks at whether a legal court order can compel a company to hand over information, even if that information is stored outside the US.

In its appeal to the high court, meanwhile, the US government said that the US tech sector should turn over any information requested with a valid court warrant. It doesn't matter where the data is hosted, the government argues. What matters, the authorities maintain, is whether the data can be accessed from within the United States.

This has the potential to open the floodgates for personal data to be accessed regardless of where it is stored. This would also gut privacy laws outside the US (or create a legal mess that will take years to untangle, and make lawyers very rich). It would also kill the tech economy and isolate the US, because who outside the US would want to connect to a mess like that?

For $1000 US, You Can Use AdTech to Track and Identify an Individual

A research team spent $1000 with an ad network, and used that to track an individual's location via targeted ads.

An advertising-savvy spy, they've shown, can spend just a grand to track a target's location with disturbing precision, learn details about them like their demographics and what apps they have installed on their phone, or correlate that information to make even more sensitive discoveries—say, that a certain twentysomething man has a gay dating app installed on his phone and lives at a certain address, that someone sitting next to the spy at a Starbucks took a certain route after leaving the coffee shop, or that a spy's spouse has visited a particular friend's home or business.

The researchers didn't exploit any bugs in mobile ad networks. They used them as designed. So, aspiring stalkers, abusers, blackmailers, home invaders, or nosy creeps: rest easy. If you have $1000 US, AdTech has your back.

Watches Designed for Helicopter Parents Have Multiple Security and Privacy Issues. Cue Surprise

In what should surprise absolutely no one, it looks like spyware designed for the hypervigilant and short-sighted parent has multiple security flaws that expose kids to focused risk.

Together with the security firm Mnemonic, the Norwegian Consumer Council tested several smartwatches for children. Our findings are alarming. We discovered significant security flaws, unreliable safety features and a lack of consumer protection.

Surveillance isn't caring. I completely understand that raising a kid can be petrifying, but when we substitute technology for communication, we create both unintended consequences and multiple other points of potential failure.