
All posts in Privacy

Hiding Behind Metadata

In reading through the coverage of the ongoing data collection by the government, I've noticed that one of the ways people obfuscate the depth of the privacy intrusion is by hiding behind the term "metadata."

David Brooks, in NPR's Week In Politics, provides an example of how jargon is used to obfuscate reality:

I'm somewhat bothered by the secrecy, but I don't feel it's intrusive. Basically, they're running huge amounts of megadata through an algorithm. That feels less intrusive to me than the average TSA search at the airport.

A more accurate description is that a TSA search is more immediate, physical, and obvious, and is intrusive in a palpable way. By describing the government data grab as passing data through an "algorithm," Brooks attempts to create a level of distance between life (real, physical and immediate) and what the government is doing (just some geeks with pocket protectors and lab coats). The phrase "running megadata through an algorithm" is, from a technical standpoint, meaningless. If you're using any computer, from a smartphone to a mainframe, you are running data through an algorithm. It's what computers do. It's how repetitive tasks get automated.

Returning to the specifics of what metadata can show, the Electronic Frontier Foundation has a great post on why metadata matters. They highlight some scenarios that show where, just by examining information about the call, you can easily infer the details of what was discussed:

They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don't know what was discussed.

They know you received a call from the local NRA office while it was having a campaign against gun legislation, and then called your senators and congressional representatives immediately after. But the content of those calls remains safe from government intrusion.

They know you called a gynecologist, spoke for a half hour, and then called the local Planned Parenthood's number later that day. But nobody knows what you spoke about.

The EFF examples show how basic information about a call can show key details that suggest the contents and nature of the call. However, the dataset the government collects is more multifaceted than what the EFF discusses. Because the government is collecting multiple datasets from different sources, it can cross-reference these datasets in more sophisticated ways to draw more specific inferences.

From phone records, the government can set a baseline of "normal" call activity. From simple online activity, such as Facebook likes, people can be profiled. From location records (available from cell phone data), a pattern of movement can be predicted.
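To make the idea of a "baseline" concrete, here is a minimal sketch of how unusual call activity could be flagged from nothing but daily call counts. This is an illustration built on assumptions, not a description of any actual government system: the function name `flag_unusual_days`, the threshold of two standard deviations, and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_days(daily_call_counts, threshold=2.0):
    """Return the indexes of days whose call count deviates from the baseline.

    The baseline here is simply the mean of the series. A day is flagged
    when it sits more than `threshold` standard deviations from that mean.
    Both choices are assumptions made for this sketch.
    """
    baseline = mean(daily_call_counts)
    spread = stdev(daily_call_counts)
    if spread == 0:
        # Every day looks identical, so nothing stands out.
        return []
    return [
        day
        for day, count in enumerate(daily_call_counts)
        if abs(count - baseline) / spread > threshold
    ]


# Hypothetical data: number of calls per day over two weeks.
calls = [4, 5, 3, 4, 6, 5, 4, 5, 4, 3, 5, 4, 21, 5]
unusual = flag_unusual_days(calls)
print(unusual)
```

Note that nothing in this sketch looks at what was said on any call. The count of calls per day is enough to single out the day that breaks the pattern.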

To use one example, let's say data analysts set a flag on people who have identified with Tea Party groups to look for deviations in calling frequency. Want to get a sense of what the person making these calls is thinking about? Look at their search history (likely accessible from data provided by Google, Microsoft, and Facebook). Look at any videos they watched (available via data from YouTube). Based on knowledge about their past movement history, see if they went anyplace out of the ordinary. Then, because the government has a list of this person's contacts, run the same analysis on contacts, going two degrees (friends, and friends of friends) out.
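The "two degrees out" step is a simple graph walk. The sketch below assumes the contact list is stored as a dictionary mapping each person to the people they called; the names `contacts_within` and `call_graph`, and the people in it, are made up for illustration and do not reflect how any real system stores this data.

```python
from collections import deque


def contacts_within(graph, start, max_hops=2):
    """Breadth-first walk of a contact graph.

    Returns a dict of {person: hops from start} for everyone reachable
    within `max_hops` calls, excluding the starting person.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            # Do not expand past the requested number of degrees.
            continue
        for contact in graph.get(person, ()):
            if contact not in hops:
                hops[contact] = hops[person] + 1
                queue.append(contact)
    del hops[start]
    return hops


# Hypothetical call records: who called whom.
call_graph = {
    "flagged": ["ann", "bob"],
    "ann": ["flagged", "cara"],
    "bob": ["dev"],
    "cara": ["erin"],
    "dev": [],
}
reach = contacts_within(call_graph, "flagged")
print(reach)
```

In this toy graph, one flagged person pulls four other people into the analysis, two of whom never spoke with the flagged person at all. With real contact lists, the number of people swept in grows very quickly with each added degree.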

If you're tracking a terrorist, this is incredibly useful information. However, what happens when a whistleblower gets redefined as a terrorist? Would David Brooks be comfortable with his "metadata" - and that of his contacts, and of his contacts' contacts - getting run through an "algorithm"?

Metadata - on its own, as a single data point - can provide a fair amount of information about what a person is doing. Metadata from multiple sources, cross-referenced, moves us into an increased level of precision. When you hear someone discussing this issue, and they describe the data grab as "just metadata," you have witnessed an act of obfuscation.

How Are Schools Using Apple, Google, Microsoft, and Facebook Explaining Surveillance?

At the risk of stating the obvious, I've been following the news of widespread data collection by the NSA with some interest.

After watching things continue to unfold today - including President Obama's underwhelming defense of the program - these are some random thoughts and questions I have:

  • I'd like banks to get surveillance comparable to what civilians get.
  • I'd like to see the discussion broadened to include corporate responsibility for just acquiescing to these data requests.
  • Schools that went all in with iPads - how are you explaining to parents that your 21st Century Learning enrolled their children in 21st Century Surveillance?
  • Schools that went all-in with Google Apps or Microsoft EDU - how are you explaining that the benefits of cost savings appear to be offset by passive monitoring of the work within the school?
  • Schools that put a lot of time into building your Facebook presence - how will you explain that, by joining the school community on Facebook, you are also throwing your data into NSA servers?
  • For those of you who spent time analyzing and teaching others the "privacy" settings of Facebook, does this feel like time well spent, considering that - to at least the government and Facebook - there is no such thing as a privacy setting that works as advertised?
  • It sounds like, with Prism, the government outsourced TIA.
  • Given this level of cooperation between government and tech companies, how about we put that spirit of collaboration to work and solve the real problem of veterans waiting years for their benefits? If ever there was a problem that could benefit from good data management, the VA benefit system is it.

And yes, it is unclear how much - if any - student data is getting dropped into the net of data that continues to be given by American companies to the American government. To assume none is an act of willful naiveté that strains credibility.

The one thing I will say for Prism - according to a slide shown in the original piece, the program only costs $20 million a year to run. $20 million a year to maintain and update a data store to spy on 300,000,000 people? It is, ironically, an example of efficient government spending. To put that in relative terms, that's only $3 million more than the cost of a single drone.

How Can We Teach Privacy?

Based on recent reports, it sounds like the NSA is regularly collecting data from major phone companies, nine major tech companies, and from credit card companies and ISPs (although it's not clear whether the credit card/ISP data collection is ongoing or intermittent).

The list of companies participating includes many major players within the States; it's difficult to imagine anyone using any technology within the States not using at least two of these companies on a daily basis. Most people likely use more. The list includes:

Prism
  • Facebook;
  • Verizon;
  • Apple;
  • Sprint;
  • AT&T;
  • Skype;
  • Microsoft;
  • Google;
  • YouTube;
  • AOL;
  • Yahoo;
  • Paltalk;

Many of the companies involved have issued carefully worded denials, generally including the phrasing that says that the companies never gave "direct access" to servers, and only shared information when "required by law." However, the phrase "direct access" is so vague as to be meaningless, and the program that allows the data grab could be legal under some interpretations of the 2008 FISA update (pdf download).

Even small amounts of data can have incredible predictive power. A blunt data point such as Facebook "likes" can predict politics and sexual orientation, and can be indicative of IQ. Some initial research suggests that "likes" can be indicative of health issues. Anonymized search data can reveal incredibly detailed, troubling information. Small amounts of anonymized data can be used to pinpoint individuals. Access to past location information can lead to precise predictions of where a person will be at any time. The level of detail that companies store - and can therefore share - is pretty amazing.

And, of course, with the data the US government has been collecting, they have an incredible trove of information. From the phone carriers, they have the time and duration of all phone calls. They have the location where these calls were made (possibly from the phone's GPS, and certainly from cell towers). They have the list of who talked with whom. They likely have comparable data from Skype.

A variety of these companies give the government access to friend lists, search histories, browsing histories (think Facebook ads; login "services" from a company on websites), and productivity work (Google Apps suite, MS productivity tools). From Apple, there is a range of buying patterns and services; if you used the "find my iPhone" service, you've given them some very accurate location data. I'd also imagine App Store browsing habits could be interesting. Getting all of these data points in a single location, where they can be cross-referenced, provides an incredibly detailed look at the data footprint created by individuals.

For schools that are teaching media literacy and safe online browsing habits: how do you teach online safety and privacy in a world where there is effectively no privacy setting?

For schools that are launching 1:1 iPad initiatives along with Google Apps: how do you talk about the privacy of student data when the company responsible for safeguarding your students' information could be handing it over to the government? The move to hosted services - such as Google Apps for Education or Microsoft EDU - has been a steady move toward convenience at the cost of privacy. At what point does that cost become too high, or too unpredictable?

The details are still coming out on this story, but the outlines that we have now show that any notion of online privacy needs to be rethought.

On a closing note, why is it that our government and our tech companies can work together to assemble the technology to spy on our entire citizenry, but can't work together to get benefits to the veterans who fought to protect the freedoms our government and corporations are now trampling? Priorities, people. Priorities.

For those wanting to learn more, Bruce Schneier has a good writeup on the details of the spying program, and its implications.

Image Credit: "Prism" taken by viviandnguyen_, published under an Attribution Non Commercial No Derivatives license.

Twibbon Provides A Great Example Of An Awful Privacy Policy

Twibbon is a service that markets itself as a tool to support "your cause, brand or organisation in a variety of ways." Twibbon targets Facebook and Twitter, and provides a small graphic that gets added onto a profile picture. This graphic is a visual way to show support for a cause.

After reading through Twibbon's privacy policy, I have one question for organizations that use Twibbon: why do you require that your supporters surrender all privacy?

The Twibbon privacy policy is remarkably honest when it describes what it will collect, as it clearly states that it will get your contact information, your location, and other details related to surveys and "offers" (aka, ads and marketing).

RIP Privacy

What we may collect

We may collect the following information:

  • Contact information including email address.
  • Demographic information such as postcode, preferences and interests.
  • Other information relevant to customer surveys and/or offers.

Additionally, Twibbon clearly states that it will track the web pages you visit, what you search for, and any interactions with ads. Twibbon clearly tells you that once you sign up for their service, they will track a significant portion of what you do on the web, and save that data.

When you visit the Site, our servers automatically record information that your browser sends whenever you visit a website. This data may include information such as your IP address, browser type or the domain from which you are visiting, the web-pages you visit, the search terms you use, the location of your ISP and any advertisements on which you click. For most users accessing the Internet from an Internet service provider the IP address will be different every time you log on. We use this data to monitor the use of the Site and of our Service, to gather information about the location of our users and for the Site’s technical administration. We do not associate your IP address with any other personally identifiable information to identify you personally, except in case of violation of the Terms of Service.

Because Twibbon targets both Facebook and Twitter, it has the ability to combine data from both sources - and from Facebook, this likely includes information about all of your friends as well. This gives Twibbon a dataset that combines Facebook info, Twitter info, and info about your behavior wherever you go on the web.

This level of information allows you to be identified and tracked with remarkable accuracy. For example, even with fully anonymized location data, with an adequately large data set, researchers can pinpoint an individual based on just 4 locations. Location data is readily available within the data streams from both Twitter and Facebook.
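A toy version of this re-identification idea fits in a few lines. The sketch below is not the researchers' method; it is a simplified illustration that assumes each anonymized trace is a set of (place, hour) points and that an observer already knows a handful of points about their target. The function name `matching_traces`, the user IDs, and the data are all hypothetical.

```python
def matching_traces(traces, known_points):
    """Return the anonymized IDs whose trace contains every known point."""
    known = set(known_points)
    return [uid for uid, points in traces.items() if known <= set(points)]


# Hypothetical "anonymized" location data: no names, only (place, hour) points.
traces = {
    "user_a": {("cafe", 8), ("office", 9), ("gym", 18), ("home", 22)},
    "user_b": {("cafe", 8), ("office", 9), ("bar", 19), ("home", 22)},
    "user_c": {("cafe", 8), ("school", 9), ("gym", 18), ("home", 23)},
}

# One known point matches everyone in the data set.
one_point = matching_traces(traces, [("cafe", 8)])

# Four known points narrow the match to a single trace.
four_points = matching_traces(
    traces,
    [("cafe", 8), ("office", 9), ("gym", 18), ("home", 22)],
)
print(one_point, four_points)
```

Each additional known point shrinks the set of candidates. Once only one trace remains, the "anonymous" ID is tied to a real person, and the rest of that person's movement history comes along with it.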

Additionally, Facebook "likes" are also remarkably accurate at predicting a variety of factors, including religion, substance abuse history, sexual orientation, political stance, and introversion/extroversion. Along the same lines, anonymized search data can reveal many details about the searcher.

But of course, Twibbon doesn't need to worry about the constraint of anonymized data, as they have your exact identity within Twitter, Facebook, and the Web. When you support an organization using Twibbon, you are agreeing to have the portion of your life that you put online recorded, analyzed, and sold. Because math and statistics work, that allows people who buy access to your data to make very accurate predictions and inferences about the portions of your life that you don't put online - you know, the part of your life that we reasonably assume to be private.

Image Credit: Image found and reused from Eatwifme's Tumblr

Selling Cheap

December 10th was an interesting day for reports on apps for learning.

The Joan Ganz Cooney Center released a report on the ways in which technology can be used to foster and support improved reading skills among children. The report covers a fair amount of ground, and is worth reading in its entirety. Part of the report included a scan of apps and web sites focused on supporting literacy. From the report:

Digital products aimed at building literacy skills in young children are a significant segment of the market. Yet many of these products may not be providing the educational benefit they claim. Few apps and e-books have information in their descriptions that point to any effectiveness studies to back them up, and most only focus on very basic literacy skills that would not be useful for children who are beginning to learn skills like grammar and storytelling.

Surveillance

Also on December 10th, the FTC released its second report on privacy concerns with children's apps, and this report indicates that people selling apps to kids are still collecting data from kids, and that they are still doing it without informing parents.

Staff examined hundreds of apps for children and looked at disclosures and links on each app’s promotion page in the app store, on the app developer’s website, and within the app. According to the report, “most apps failed to provide any information about the data collected through the app, let alone the type of data collected, the purpose of the collection, and who would obtain access to the data. Even more troubling, the results showed that many of the apps shared certain information with third parties – such as device ID, geolocation, or phone number – without disclosing that fact to parents. Further, a number of apps contained interactive features – such as advertising, the ability to make in-app purchases, and links to social media – without disclosing these features to parents prior to download.”

From the first report, we see that apps designed to support literacy are doing a mediocre job of it. From the second report, we see that the manufacturers of these mediocre learning apps are doing a great job harvesting information without informing their users, or their users' parents. So, even if the kid using the app is having a mediocre learning experience, the manufacturer of the app is still able to use your demographic data to sell ads, and/or raise additional VC money, and/or sell your user data outright.

If your kid is attending a school that is rolling out an iPad program, it's worth asking if they have done a privacy audit on the apps they are using. Ask for the process they have used, and for examples of privacy policies that they found incompatible with the rights of their learners. Ask to see a documented process or rubric that they use to evaluate privacy of apps that they will use in their programs.

If you are rolling out a 1:1 program, what do your privacy audits look like? What steps do you take to ensure that the privacy of your learners is respected? How do you communicate about this to teachers, students, and families?

As adults, we can make decisions about how we want to protect (or not protect) our privacy. But we shouldn't require kids and their families to expose themselves to marketers as a precondition to learning. Additionally, given that some of the more popular apps don't promote higher level thinking, if we are going to sell out privacy as a means to learning via apps, we should at least ensure that we get something worthwhile in exchange for our privacy.

Photo Credit: Lextech, via Nowhere Else

Facebook Screwed Up User Data Again. And They're Sorry This Time. Really. Kind Of.

So, Facebook changed people's default email addresses so that the facebook.com email address became the default, replacing whatever email the actual human connected to the account had chosen as their default. And people didn't like that.

Sad Facebook

But Facebook is sorry, kind of, and they really kind of mean it.

A Facebook spokeswoman said Tuesday that "in hindsight" the company probably should have better explained the email switchover.

This is what Facebook does. They screw with your data, and then they apologize.

It's what you sign up for when you join Facebook.

At what point will people finally understand that Facebook cares about the details of what you do, who you do it with, and where you do it, but they don't actually care about you?

And now, I'm looking forward to the next event where Facebook plays fast and loose with user data, and then apologizes. If history is any indicator, we should have another good screwup and another faux-pology before the end of 2012.

Image Credit: Image found at and reused from Study: Too Many Open Facebook Could Make Sad

Mountain Lion, Closed Systems, Privacy, and Device Churn

Some interesting dates from the not-so-distant past:

December, 2009: "Apple has said it rejects 10 percent of submissions for being 'inappropriate,' in some cases because they try to steal personal data".

November, 2011: Apple kicks a security researcher out of its developer program for developing a proof of concept that shows how to exploit a security hole. The best part: the researcher had reported the flaw three weeks earlier.

February, 2012: An approved app, available in the App Store, is caught uploading entire address books (aka, stealing personal data), without user consent or knowledge. This app was never pulled from the App Store, and an updated (non-address-stealing) version is still available.

Rotten Apple

Apple has done a great job of pairing marketing hype with security through obscurity. Apple has created the appearance of a secure system (trust us! we're the gatekeepers!) but the holes in this system keep reappearing. I'm not saying that other systems are any more or less secure; however, other systems don't attempt to parlay a walled ecosystem into the equivalent of a secure environment. There have been instances of security fixes being delayed as a result of Apple's review process, resulting in users having no alternative to compromised apps, and no knowledge of the compromise.

However, despite these issues, Apple supporters - and especially Apple supporters within education - go to great lengths to describe how satisfied they are with their Apple purchases, and how they are not bothered by the increasingly intertwined way that the Apple ecosystem shuts out alternatives. Concerns about student privacy, and how iTunes accounts are effectively required to use iPads and other Mac products, have died down. People seem to have accepted that school in the 21st century requires paying companies to take over your personal data and usage patterns, and mine them for information.

But really, how many people who have gone deep into Apple could express anything but satisfaction, or even intense excitement? What are the alternatives?

Can you imagine a tech director walking into their boss and saying, "Well, this Apple hardware and software was okay, but with a little hindsight they aren't really necessary for learning, and there are other options that look promising, and might even be cheaper. I'd like to explore some other avenues. Oh, and one last thing: sorry about the several hundred thousand/millions we've spent on that hardware and software, and sorry that a good percentage of our faculty and student creative output is locked into apps that don't work on anything else but Apple stuff."

Of course people that have gone all in with Apple will be delighted with the results. The alternative is admitting that resources were squandered on something that was untested, and proved to be not as awesome as the sales teams/fanboys promised. People who have gone heavily into Apple need for Apple to be the best thing ever, as that reinforces their "vision."

So, when I read about the release of Mountain Lion, and how this is a move to annual release cycles of OS upgrades, and how people will now get the chance to upgrade every year (as opposed to having to upgrade every year), it's a move that makes sense for the direction Apple is heading: toward a fully closed ecosystem where people are pushed into frequent upgrade paths leading to increased device churn.

And learning? No problem. There's got to be an app for that.

But the one thing that doesn't surprise me is the name: Mountain Lion. Mountain lions love sheep.

Image Credit: "Rotten Apple" taken by Vince Wingate, published under an Attribution Share-Alike license.

Social Media and Cooperative Surveillance

So the Bruins won the Super Bowl. Or something like that.

And in the aftermath, people rioted in Vancouver. And in those riots, pictures and videos were taken.

And some people took it upon themselves to identify the rioters.

Stanley Cup

And after the aftermath - with nearly 170 people treated in hospitals, and volunteers cleaning up the city - people began to ask questions about surveillance and the role of social media.

In the comments of her post linked above, Alexandra Samuel extends her original thoughts to include the "slippery slope" argument:

I don't see how we can claim to be uncomfortable with mass surveillance -- to fear Big Brother -- but then make exceptions when it's convenient, or feels important. This is a slippery slope and we can't draw too many simple lines -- even a line based on exposing illegal behaviour (as opposed to legal but controversial). Remember that there are places where it's illegal to smoke dope, or criticize the government, or hold hands with someone who is the same gender as you. Do we accept social media surveillance in those contexts?

To start, it's worth pointing out that most slippery slope arguments aren't worth the air required to set them loose. A "slippery slope" argument assumes that we live in a world with moral absolutes, and that making a "wrong" choice plunges us into the abyss of uncertainty and ambiguity.

But with that said, to all those who argue that people using social media to identify rioters to the state are engaging in community surveillance/crowdsourcing Big Brother/engaging in nefarious deeds to further the expansion of the omnipresent nanny state: you are late to the game. That ship has sailed. People are reporting on one another, and have been for years, well before the advent of the social web. Perversely enough, people using Facebook are complicit in building their own Panopticon. And, in using sites like Facebook - where people throw their contact information, their interests, the places they like to go, the people they like and dislike, things they buy, games they play (and how they play them), what they look like, what their friends look like, etc, etc - people leave a broad data trail. Even rough data shows a lot about individuals; more sophisticated datasets allow for more sophisticated predictions.

It would be interesting to look at what could be discerned from a person's datastream on Facebook, combined with the data accessible via the phones and laptops we use, and how close that would come to supporting the data needed to make the Information Awareness Office a reality.

But to return to the argument of what constitutes an appropriate use for social media, and what level of privacy is reasonable to expect: we need to ground these conversations within the historical reality that people have been disagreeing, behaving badly, attempting to avoid responsibility - and then talking about it - for centuries (as an aside, Augustine would have had an AWESOME twitter feed). Social media just lets us get the word out faster.

And, if you are now concerned about privacy, and the relationship between surveillance, privacy, and the state, there is one thing you can do right now to make it better: stop using Facebook, Foursquare, Twitter, etc, as outreach and communication tools. To use social media is to participate in a continuous act of cooperative surveillance: sometimes we're watching ourselves, sometimes we're watching others, sometimes we're being watched, but the difference between sharing and observing is largely a matter of the side of the window you're on.

For the many self-proclaimed "social media consultants": stop advocating an expanded use of Facebook, Twitter, etc, to the detriment of an organization's primary web site. If you have engaged in such unseemly behavior in the past, it's never too late to admit your mistakes. Just stop repeating them. And if you have been working in social media for more than 15 minutes and are actually surprised by privacy implications, you can always go back to selling cars.

Seriously, though, if you are giving advice to an organization that does social justice work, be very careful of the relationships you encourage them to foster on external social sites. Given Facebook's unclear direction in China, the ease with which apps can access and store user data, the way bugs leak private data, and Facebook's own hamfisted "privacy" efforts (from Beacon to facial recognition and everything in between), encouraging social justice-oriented groups to work on Facebook could be putting people at unnecessary risk.

As we talk about privacy and surveillance, we need to remember that a key difference between a surveillance tool and a tool for individual or collective empowerment is who controls the data, and how that data is used.

Image Credit: "Patrice Bergeron" taken by slidingsideways, published under an Attribution Non-Commercial No Derivatives license.

Google and Data Collection

Last May, Google announced that it had accidentally collected personally identifiable information as part of capturing data for the Street View functionality of Google Maps.

A look at the technical aspects of what was collected, and why, tends to support Google's explanation that this was accidental, and not anywhere near as big a deal as people wanted it to be.

New Camera

Please don't misunderstand - Google has plenty of issues with user privacy, and the ramifications for student privacy as more K-12 schools transition to Google Apps are mind-boggling. But, the kerfuffle over data collected for Street View is overblown.

Moreover, Google appears to be taking steps to mitigate this, and they are candid about their role in the failure, and clear about the steps they are taking to improve it. Other companies with widespread privacy issues (cough cough Facebook cough cough) could learn from how Google is handling this.

Image Credit: Photo "New 'Camera'" taken by Sherman Tan, published under an Attribution license.

Bad Execution As A Feature

A great new feature that comes with Facebook Groups: any friend can add you to any group, without your permission.

And, it's really easy to impersonate someone!

So, I wonder how long it will take for a teacher to get in trouble for belonging to a group they were added to by a "friend."

I don't know how many more times I'll need to say this, but I'll add this additional time to the pile of others: Facebook is a business, and Facebook only cares about your interests up to the point where they can study them and profit from access to them. That is why they allow you to "connect" with things. Any benefit you receive is purely incidental.
