Being Tracked While Learning About Being Tracked

1 min read

It's really good to see Laura Poitras's film "The Art of Dissent" over on the New York Times. The film is amazing; earlier this spring, Kashmir Hill had a great writeup on their work, and more.

Two things struck me while visiting the NY Times to watch the film: first, the number of trackers that loaded on the page. Second, the film is preceded by an ad from IBM extolling the virtues of big data in policing.

Just for kicks, I took a screencast of the experience. You can see the trackers stacking up in the lower right-hand corner of the screen. The ad plays while the trackers stack up.

So, the process of watching a film about dissent in the age of mass surveillance means exposing data to a range of corporate trackers, and watching an ad extolling the virtues of mass surveillance.

Where The Sidewalk Ends: Wading Through Google's Terms of Service for Education

11 min read

Google Apps for Education has been very popular in K12 and higher ed. The service is free, and Google makes some carefully phrased claims about how Apps for Edu does not show ads to users within the core suite of Apps. These claims are often repeated with less nuance by consultants who have been certified to train schools and districts on using Google Apps. Unfortunately, as is often the case, the reality doesn't live up to the sound bite. In this post, we will examine the loopholes that permit data collected from students with Google Apps accounts to be used for non-educational purposes.

There are five main issues that complicate absolute claims about what Google does or doesn't do with data collected from people within Google Apps for Edu.

We'll get into more detail in this post, but the tl;dr version runs like this:

Google defines a narrow set of applications as "core" Apps for Edu services. These services are exempt from having ads displayed alongside user content, and from having their data used for "Ads purposes". However, apps outside the core services - like YouTube, Blogger, and Picasa - are not covered by the terms of service that restrict ads. The same is true for integrations of third party apps that can be enabled within the Google Apps admin interface, and then accessed by end users. So, when a person in a Google Apps for Edu environment watches a video on YouTube, writes or reads a post on Blogger, or accesses any third party app enabled via Google Apps, their information is no longer covered under the Google Apps for Education terms.

To put it another way: as soon as a person with a Google Apps for Education account strays outside the opaque and narrowly defined "safe zone" everything they do can be collected, stored, and mined.

So, the next time you hear someone say, "Google Apps doesn't use data for advertising," ask them to explain what happens to student data when a student starts in Google Apps and then goes to Blogger, or YouTube, or connects to any third party integration.

Background

Google has been making a concerted effort to improve its privacy practices in education. In early 2014, it came to light that Google was data mining email in education products. This was followed a few months later by the announcement that Google would no longer display ads in core Google Apps, and would no longer scan emails in Apps for Edu.

This shift in practice appears to be the origin of the claim that "Google doesn't collect any data on students." This post by Tracy Mitrano gives a more detailed overview and background.

There's A Hole In The Bucket

In an earlier post last week, I explored some basic issues with even finding the Google Apps for Edu terms of service. In that post, I also outlined some quick and easy fixes for some of the more basic problems.

One of the problems identified in the earlier post has been fixed in the last week: the link to the page that outlines the core services now actually points to the correct location. The list of apps covered under the core Apps for Edu terms includes Gmail, Calendar, Drive, Hangouts, Sites, Contacts, Groups, and Google Apps Vault.

The list of additional services not included and covered under Edu terms includes Blogger, YouTube, Maps, Custom Search, Picasa, and Web History.

So, if a school using Google Apps for Edu wanted to do a unit on digital citizenship and time management and use Web History as a teaching tool, the only way to do that would be to push student data under Google's normal terms of service, where it could be mined and sold.

Additionally, while Google's specific terms for edu state that search data would not be scanned for "Ads purposes," it looks like searches via any custom search appliance would be scanned and mined. I'd love to get clarification from within Google on how data in custom searches is handled.

When the administrator of a Google Apps for Education instance enables non-core services covered by different terms of service, it's not particularly clear to admins that different terms apply.

When end users access these services, they do it under the umbrella of their Google Apps account. From an end user perspective, it doesn't make sense that these services would be under different terms, and the login process does nothing to highlight that users are entering a different part of Google's corner of the web, governed by different rules. We go into additional detail on how this works later in this post.

Integration with Third Party Apps

The issues outlined above for non-Core apps are worse for third party integrations available through the Marketplace.

Third party integrations are enabled by admins within the Google Apps Admin console. Once these apps are enabled, users within the Google Apps domain can access these additional software packages. "Integration" usually starts with single sign on and a common identity between the Google Apps domain and the third party vendor, but it could potentially also cover sharing contacts and other data. It's not always clear and obvious to Google Apps admins that they are creating an environment where learner data is flowing to third party vendors. Additionally, when a learner or teacher accesses an app that has been enabled via Google apps, it feels like part of a unified experience. It's a great user experience, but it's a data privacy nightmare. Because the integration is clean, it feels like part of the same system, which implies that the same rules would be in place.

However, every time a learner accesses a third party app via their Apps for Edu account, their data flows to the third party vendor, and is governed by the terms set by that vendor. Google's rules no longer apply.

Let's Talk About "Ads Purposes"

In their education-specific terms of service, Google makes the following statement about data and ads:

Claim of no ads

1.4 Ads. Google does not serve Ads in the Services or use Customer Data for Ads purposes.

This statement sounds pretty good. Google doesn't serve ads.

However, it's worth remembering that not serving ads is not the same as not processing or mining data. You can mine data, and derive benefit from what you learn in the process, without serving ads. It's also unclear what exactly "Ads purposes" means - it is vague to the point of being meaningless. Google could improve this individual issue in two ways. First, they could define exactly what they mean when they say, "Ads purposes." Second, they could define exactly how they process data collected within the core Apps for Edu suite, and how they use that data.

In section 2.2, Google buries a reference to Non-Google Apps Products in the Compliance section (emphasis added):

Non-Google Apps terms

2.2 Compliance. Customer will use the Services in accordance with the Acceptable Use Policy. Google may make new applications, features or functionality for the Services available from time to time, the use of which may be contingent upon Customer's agreement to additional terms. In addition, Google will make other Non-Google Apps Products (beyond the Services) available to Customer and its End Users in accordance with the Non-Google Apps Product Terms and the applicable product-specific Google terms of service. If Customer does not desire to enable any of the Non-Google Apps Products, Customer can enable or disable them at any time through the Admin Console.

By burying the concept of Non-Google Apps Products, Google makes this element of the Apps for Education terms unnecessarily complicated.

In section 16 of the terms, Google lists out nearly fifty separate definitions, including this one:

Link from section 16

"Non-Google Apps Product Terms" means the terms found at the following URL: http://www.google.com/apps/intl/en/terms/additional_services.html, or such other URL as Google may provide from time to time.

So, for those playing along at home, Google starts with an absolute statement in section 1. They undercut that statement in section 2. They then provide the link to the actual terms in section 16, but the link is buried within nearly 50 other definitions.

When we follow the link to the Non-Google Apps Product Terms, the first point finally spells out the condition that allows user data from within Google Apps for Education to leak into more permissive terms of service:

Not covered. At all.

Not Subject to Google Apps Agreement. The Additional Services are not governed by the Google Apps Agreement, but are governed only by the applicable service-specific Google terms of service.

After knitting together related clauses from three different sections of the terms of service, and following a link to a completely separate set of terms, we finally see that the terms make a clear distinction between core Apps for Education, and everything else. However, because all of these apps appear in the Admin Panel of Google Apps for Edu, and in many cases the person administering Google Apps is not the person in charge of vetting terms for Google Apps, this difference is, at best, unclear.

So What Does All This Mean, Again?

We've covered a fair amount of ground in this post, and gotten deep into the weeds of Google's policies. The way the policies are written, it seems like one clear absolute is that ads will not be displayed alongside user content.

It's not entirely clear, however, what Google does do with any data collected from the core apps within Google Apps for Education.

It is also clear that as soon as a student or teacher leaves the narrowly defined limits of core Google Apps, their data is up for grabs to be used for advertising, or any other purpose defined in Google's general terms of service. Unless a Google Apps for Education account is set up in an incredibly locked-down configuration, it's hard to see how learners can avoid - or even know - where their information is going, and the terms under which it is being used.

But the clear takeaway: as soon as a learner strays outside the core Google Apps offerings, their data can be used for a range of non-educational purposes.

Suggested Improvements

There are a range of ways that Google's terms for education could be improved. The suggestions here are the tip of the iceberg, and ONLY address the issues that make it difficult to understand exactly what Google is doing. Once Google has improved the readability and transparency of their terms, we could go into more detail on specific ways that the terms can be improved to protect student privacy.

To improve some of the issues listed here, Google should:

  • Explain exactly how learner data will be scanned within the core Apps for Edu suite;
  • Extend the education terms of service to all other Google apps that aren't currently covered as part of the core apps suite. If there are applications that Google owns where this is not possible, they should be removed from the free offering list and treated like any other third party integration;
  • For third party integrations and Google products that use different terms of service, add a step to the process for Google Apps domain admins that highlights and explains that all end users will be sending data to a third party, covered under different terms;
  • On a regular basis (every three to six months?), email an apps report to the purchaser of the domain and all domain admins summarizing the enabled apps, and which ones fall outside Google's core Apps for Education. This way, unused apps could be pruned, and in the case of staff turnover, the existing setup could be reviewed. This would also give domain admins the chance to review the privacy policies and terms of enabled apps within the domain.

There are a host of other things that could be done that include editing the terms of service for clarity. However, the issues highlighted in this post provide some easy starting points.

Migrating from Drupal 7 to Known

10 min read

What's Next?

As you can see, funnymonkey.com has gotten quite a facelift. When we realized that FunnyMonkey would be going through a transition, Bill and I reviewed what the future of funnymonkey.com would look like. Historically, the reason to keep coming back has been Bill's blogging on education and education policy, so the focus would be on something that worked well as a blogging platform. We cast the net wide and considered many options, including staying with Drupal, migrating to WordPress, Laravel, Revel, Go, etc.

In the end we chose Known. Known had been on my radar since meeting Ben Werdmüller and Erin Jo Richey at Reclaim Your Domain: The UMW Hackathon. Besides being great people to talk with and work with, Erin and Ben have a great vision for Known and a solid architecture. Known is built with the ethos of the IndieWeb movement and the POSSE publishing model, and the ethos of Known and FunnyMonkey line up pretty closely.

How do we get our content into Known?

Okay, we've chosen Known, and we have ten years of content sitting in a Drupal 7 site. Now what?

After a cursory review, the import and export routines within Known appeared to be hardcoded and, as far as I could tell, not pluggable. That's a minor disappointment (more on this later). At this point it looked like a custom plugin was the way forward. Known plugins are pretty straightforward, and looking at the default ones proved to be quite helpful. For instance, take a look at Bridgy's Main.php file (found under IdnoPlugins):


    namespace IdnoPlugins\Bridgy {
        use Idno\Common\Plugin;
        class Main extends Plugin {
            function registerPages() {
                \Idno\Core\site()->template()->extendTemplate('account/menu/items', 'bridgy/menu');
                \Idno\Core\site()->addPageHandler('account/bridgy/?','IdnoPlugins\Bridgy\Pages\Account');
            }
        }
    }

That's it for a minimal plugin: just register some pages and templates. Past that, there is an expected directory structure where Known will find the registered page handlers and templates. Again, reviewing Bridgy:


Bridgy/
├── Main.php
├── Pages
│  └── Account.php
├── plugin.ini
└── templates
    └── default
        └── bridgy
            ├── account.tpl.php
            ├── facebook.tpl.php
            ├── menu.tpl.php
            └── twitter.tpl.php

We see that the call to \Idno\Core\site()->addPageHandler() registers a page handler for account/bridgy, implemented by the class IdnoPlugins\Bridgy\Pages\Account (found in Pages/Account.php). That's the basic structure. I'm covering Bridgy for a couple of reasons:

  1. It's simple: It doesn't take much code to constitute a plugin in Known.
  2. It's included: The code I'm about to show you is my first Known code and is largely one-off, since it is a migration and will not have ongoing use. Using Bridgy as a reference is a bit more illuminating, as it's fair to say it is likely idiomatic Known code.

Writing a content migration plugin

Caveat: This is not exemplary code and can be improved in many ways. What it does show you is how easy it is to get content from other systems into Known. There are many points worth considering for refactoring, such as storing the new-ID-to-old-ID association as the content is imported rather than outside of the save routine(s). That said, you can find the code we used over here.

I'm going to skip over the detailed points of the code, in the hope that it is commented well enough and easy enough to read. Instead, this will focus on an overview of the process.

Assumptions

  1. The Drupal database will be available during the import routines. For this we just backed up the funnymonkey.com database and restored it locally on our development stack.
  2. The Drupal files directory will be available during the import routines. It was just rsync'd from the production site into /srv/www/legacy/files.
  3. The migration will proceed in the following order, as dictated by dependencies:
    1. Files: have no dependencies
    2. Users: have profile pictures, and so require Files
    3. Nodes: have associated authors and files, and thus require the Files and Users imports
    4. Comments: require Nodes
  4. The source content is in MySQL
  5. URL rewrites will be created to map all content
  6. Some method to check old content and new content will be necessary for quality checking

Writing our plugin

Registering pages


function registerPages() {
    // Administration page
    \Idno\Core\site()->addPageHandler('admin/drupalmigration','\IdnoPlugins\DrupalMigration\Pages\Admin');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/users','\IdnoPlugins\DrupalMigration\Pages\User');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/nodes','\IdnoPlugins\DrupalMigration\Pages\Node');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/files','\IdnoPlugins\DrupalMigration\Pages\File');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/comments','\IdnoPlugins\DrupalMigration\Pages\Comment');
    \Idno\Core\site()->template()->extendTemplate('admin/menu/items','admin/drupalmigration/menu');
}

In order, we register pages for the following:

  1. Admin page: this will be our overview where we set our database settings. Arguably this could be omitted and just hardcoded.
  2. User Page: This will be the overview for user import.
  3. Node Page: This will be the overview for node import.
  4. File Page: This will be the overview for file import.
  5. Comment Page: This will be the overview for the comment import.

Then we register a template extension to get our DrupalMigration entry into the admin menu. This is just a snippet that extends the existing menu to include our options for the DrupalMigration plugin. Review the contents of

DrupalMigration/templates/default/admin/drupalmigration/menu.tpl.php

to see how this injects our menu options into the default menu.
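
For reference, a menu extension template of this kind is usually just a small fragment of markup linking to the plugin's pages. The snippet below is a hypothetical sketch rather than the exact contents of our menu.tpl.php; it builds the link with getDisplayURL(), the same call we use for redirects later on.

<?php
// Hypothetical sketch of a menu extension template; the actual
// menu.tpl.php in the plugin may differ. It emits an extra menu item
// pointing at the migration overview page registered above.
?>
<li>
    <a href="<?php echo \Idno\Core\site()->config()->getDisplayURL(); ?>admin/drupalmigration">Drupal Migration</a>
</li>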

Implementing a page

I'm only going to cover the process for the File portion, as that is our first page and is representative of the process for all the other pages (excluding the overview page where the database settings are input). The framework for this file is the following:


namespace IdnoPlugins\DrupalMigration\Pages {
    class File extends \Idno\Common\Page {
        function getContent() {
        }

        function postContent() {
        }
    }
}

We extend \Idno\Common\Page and implement two methods, one handling GET requests and one handling POST requests. In the File page's getContent() method we first ensure that only admins can access the page via $this->adminGatekeeper(); we then build out some tabular data to give an overview of the files to be imported and their status. We store ongoing migration data inside Known's site config. Arguably, we should have used an external table to manage this, which would be especially necessary for larger migrations. The file map, which tracks files we have already imported, is stored in \Idno\Core\site()->config()->drupal_migration_file_map. Most of this code consists of building up a data structure which we then pass to our admin/file template.
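
To make that shape concrete, here is a rough sketch of getContent(). The variable names are illustrative, and the template-rendering calls at the end are an approximation of Known's template API rather than the plugin's exact code:

function getContent() {
    // Only site admins may view or run the migration.
    $this->adminGatekeeper();

    // Pull the map of already-imported files out of Known's site config.
    $filemap = \Idno\Core\site()->config()->drupal_migration_file_map;
    if (empty($filemap)) {
        $filemap = array();
    }

    // Build tabular data describing each legacy Drupal file and whether
    // it has been imported yet (query details omitted).
    $rows = array();
    // ... query the legacy Drupal database and populate $rows ...

    // Hand the data to the plugin's admin/file template for rendering.
    // (These template calls are an approximation of Known's pattern.)
    $t = \Idno\Core\site()->template();
    $body = $t->__(array('filemap' => $filemap, 'rows' => $rows))->draw('admin/file');
    $t->__(array('title' => 'Drupal Migration: Files', 'body' => $body))->drawPage();
}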

You can review the template in DrupalMigration/templates/default/admin/file.tpl.php. Again, this could be better architected so that more of the logic happens inside getContent(), leaving the template to just iterate and output rather than do any calculations. That said, our template does do a bit of work to present URL rewrites for the files that have been migrated, so that we can include those in our .htaccess after the migration.

For the postContent() method we again ensure the user is an admin, then iterate over the files and use our plugin class's methods to handle all of the heavy lifting of getting the files into Known. After we process all of the files, we redirect back to the same page via $this->forward(\Idno\Core\site()->config()->getDisplayURL() . 'admin/drupalmigration/files'); so the user can see the results.
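
In outline, postContent() looks roughly like the sketch below. The loop source and the importFile() helper are illustrative placeholders for the plugin's actual file-handling methods; the adminGatekeeper() and forward() calls are the ones described above:

function postContent() {
    // Only site admins may run the import.
    $this->adminGatekeeper();

    // Iterate over the legacy Drupal files. importFile() stands in for the
    // plugin's actual helper that copies a file into Known and records it
    // in the file map; both names here are illustrative.
    foreach ($this->getLegacyFiles() as $drupalFile) {
        $this->importFile($drupalFile);
    }

    // Redirect back to the overview page so the admin can see the results.
    $this->forward(\Idno\Core\site()->config()->getDisplayURL() . 'admin/drupalmigration/files');
}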

Additional details

Hopefully everything so far has been helpful. The code could be used as a starting point for other Drupal site migrations into Known. The constants at the top of the file will need adjustment to appropriately grab your content. Assuming you use the same field names for the SQL queries, the rest of the import code should largely work. Outside of those constants at the top, the following methods will likely need review and refactoring to meet your needs:

  • getFiles(): This currently includes a bunch of unmanaged files and dummies them up to match the managed files data structure. The list of unmanaged files that should migrate will vary from site to site.
  • addUser(): Hardcodes adding a couple of users as admins; this could be omitted. All user accounts are given mangled passwords between 68 and 127 characters in length. The idea here is to require users to set a new password via Known's password reset process.
  • rewriteURL(): Can be modified to clean up any garbage content and normalize URLs into one particular format (a minimal sketch of the idea follows this list). We opted to switch to relative rather than absolute links so that testing would work fine when we were not on the funnymonkey.com domain. This could also be extended to support rewriting node references to other nodes as well, but we opted to defer to 301 (moved permanently) redirects.
  • rewriteContentLinks(): We rewrite content references using our rewriteURL() process so that we can map files to their new destination and normalize on the same process for all content.
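
As promised above, here is a minimal sketch of the normalization idea behind rewriteURL(). This is not the plugin's exact implementation; it just shows stripping the old domain so links become relative and keep working on any test domain:

// Hypothetical sketch: convert absolute links on the old domain into
// relative links. The real rewriteURL() also cleans up other legacy
// URL patterns.
function rewriteURL($url) {
    return preg_replace('#^https?://(www\.)?funnymonkey\.com#', '', $url);
}

// Example: "http://funnymonkey.com/node/123" becomes "/node/123".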

Taxonomy is handled by mapping to hashtags appended to the end of the content. See addNode() for more details.
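
The mapping amounts to turning each Drupal term name into a hashtag and appending the result to the imported body text. A minimal sketch, with an illustrative function name, might look like:

// Hypothetical sketch: turn term names such as "open source" into
// hashtags ("#opensource") and append them to the imported content.
function appendTaxonomyHashtags($body, array $termNames) {
    $hashtags = array();
    foreach ($termNames as $name) {
        $hashtags[] = '#' . preg_replace('/[^A-Za-z0-9]/', '', $name);
    }
    return $body . "\n\n" . implode(' ', $hashtags);
}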

URL rewrites

In addition to each step in the migration rendering a list of rewrites at the bottom of the import screen, Drupal also uses URL aliases (the url_alias table) that we need to account for.

The following SQL does that for us; we omit all aliases that do not point to users or nodes.


SELECT CONCAT('RewriteRule "^', alias, '$" "', source, '" [L,R=301]') FROM url_alias WHERE source LIKE '%user%' OR source LIKE '%node%';

Points for improvement in Known

Overall, the experience with Known was fantastic; it was very refreshing to work with a system with such a tightly focused use case and quality implementation. That said, the following are points I saw as potential opportunities for improvement.

Modular import/export process
Arguably this can be handled with custom code, as we did. However, having a modular import/export process lowers a barrier to collaborating and getting content into Known. Perhaps the import/export functionality should itself be a Known plugin. In fairness, what is currently there handles other platforms that have a standardized export process, and that's a good first step. Besides, Drupal is far from being in a place to have a standard export routine across various implementations. For Drupal, there could be a standard Views export template that you map your content into, and then a generic Drupal-to-Known importer that imports data in the particular format defined by the Views template, but that's a Drupal project.
addAnnotation() doesn't return an ID
The other methods for saving Known content all return the newly created ID when creating new objects. This is really a minor nitpick, but it made checking the import routine a bit haphazard and prevented a one-to-one mapping for the URL rewrites. In our case we opted to rewrite to the source document rather than the specific comment. While this loses the direct link, it does not break the link in the event anybody had linked to the site externally.

Apple's Plan For Managing COPPA Consent Has A Couple Problems

4 min read

For schools running programs using Apple hardware, Apple has set up a guide to streamline the process of creating an Apple ID for each student.

This guide includes a section on creating accounts for students under the age of 13:


Under the Children's Online Privacy Protection Act (COPPA), Apple must obtain verifiable consent by a parent or guardian to our Privacy Policy and Parent Disclosure and Consent notice before an Apple ID can be created for a student under 13 years of age (see apple.com/privacy/parentaldisclosureconsent.pdf). As part of the communications process with parents or guardians, your institution will need to assist Apple in ensuring consent is obtained from a student's parent or guardian.

The instructions quoted above indicate that the school is the broker for arranging parental consent. This is a fairly standard practice among edtech companies (whether or not this is good practice is a different conversation). We will also look at the linked Parental Disclosure Consent doc later in this post.

Apple's guide includes step by step instructions for creating Apple IDs. The first step merits a close reading:


Step 1. Prepare Apple ID request. To create new Apple ID accounts, you will need to upload a correctly formatted, comma-separated value (CSV) file containing a list of students who need Apple IDs. To download a template, go to Import Accounts and click Download Account Template. To edit the template, use an application such as Numbers, Microsoft Excel, or another application that can save CSV files. To complete the template, you will need to provide the batch number or batch name, student name, student Apple ID, student date of birth, and the parent or guardian email address for each student request.

To highlight a couple of key points, Apple's process requires that every school prepare a text file (CSV stands for comma-separated values) with the name, birthdate, and parent contact of every student. Plain text files offer no protection at all - anyone who gets a copy of this file can read everything in it. So, Apple's recommended method for creating student IDs requires compiling a comprehensive list of sensitive student data in one of the least secure formats available.

This post explaining the process goes a step further; they use student Gmail addresses for the Apple IDs. Practically, this means that if this file were ever compromised, anyone who accessed it would have student names, dates of birth, parent email addresses, and student email addresses.

In case people wonder why this is a concern: when you store data in an insecure format, you expose your students to greater risk - as happened with these students in New Orleans. And these students in Escondido. And these students in Seattle. And these students in Maine.

By encouraging the use of CSV files to move sensitive student information, Apple encourages insecure data handling practice. It's unacceptable for any age group, but it somehow feels worse for students under the age of 13. The fact that this is accepted as sound practice by edtech staff anywhere is also problematic.


Parent Disclosure

The full text of the Parent Disclosure doc is essential reading, but we will highlight a couple of sections here. For example, once Apple has parental consent, the company is clear that it will collect a range of information from all students.


We may collect other information from your student that in some cases has been defined under COPPA as personal information. For example, when your student is signed in with his or her Apple ID, we may collect device identifiers, cookies, IP addresses, geographic locations, and time zones where his or her Apple device is used. We also may collect information regarding your student's activities on, or interaction with our websites, apps, products and services.

Apple states very clearly that they will tie location and behavior to an identity for all students, of all ages.


At times Apple may make certain personal information available to strategic partners that work with Apple to provide products and services, or that help Apple market to customers. Personal information will only be shared by Apple to provide or improve our products, services and advertising; it will not be shared with third parties for their marketing purposes.

In this clause, Apple clearly states that Apple itself will use personal information to improve its advertising and marketing.

According to Apple's published policies and documentation, participating in a school's iPad program requires student data to flow to Apple's marketing and advertising, and encourages sloppy data handling by schools.


Wherefore Art Thou, Google Apps For Edu Terms of Service?

2 min read

Trying to find Google's Apps for Education Terms of Service page is akin to spending a weekend unicorn hunting while quaffing cocktails from the Holy Grail.

And please, if I have missed an obvious place where Google's current Terms of Service for Apps for Edu are linked, tell me. I have spent a foolishly long amount of time trying to nail this down, and I would love to know that I had missed something obvious. The shadow of a PEBKAC looms long, and could easily extend into this examination.

Our quest for current Google Apps for Edu Terms of Service leads from Google's Trust page, to the signup page where a district or school would get Apps for Edu, to the Product list, to the product overview page, to the top-level Google for Education page, to Google search (which leads to outdated terms).

It seems like the most reliable way to see the Terms of Service for Google Apps for Edu is to ask people who are already running Google Apps. It really shouldn't be this complicated.

Google could improve this easily by taking the following basic steps:

  • Add a link to the correct Terms of Service from their Trust page;
  • Add a link to the correct Terms of Service from all product description pages;
  • Add a link to the Privacy Policy and Terms of Service from the signup page for Google Apps;
  • Fix the broken link on their Trust page;
  • Add "Updated" dates to the current terms of service;
  • Add text to outdated policies that are no longer active, and link to current policies.

Future posts will address some of the ways that Google's terms allow student data to leak out and be used outside the Apps for Edu terms of service. However, that is a separate issue from basic transparency. For a company founded on making data on the web more discoverable, the opacity of Google's basic terms should be an easy problem for Google to fix.


Notes from the Privacy Presentation at the Portland EdTech Meetup

2 min read

Last night, I had the opportunity to present on privacy issues in educational technology to the Portland EdTech meetup. We had about 45 minutes to talk, which let us scratch the surface. We also had a good mix of vendors, people from higher ed, and people from K12. I'd love to see parents and (gasp) students in the mix at future events. I strongly prefer getting different stakeholders together, as all stakeholders benefit from hearing different perspectives and concerns.

The slides from the presentation are on Google Drive. The presentation is licensed under a Creative Commons Attribution-Share Alike license. Feel free to use any piece of it, and link to this post by way of attribution.

Below, I pulled out links to useful resources from the presentation.

A. Background Info

If You Could Only Read Two Things

US Federal Landscape

B. Implications of Thoughtless EdTech

C. Data Release and Implications

This is FAR FROM COMPREHENSIVE. Rather, these small examples show some of the complexities involved in deidentifying data, and how combining data sets can render some efforts at deidentification meaningless.

D. Good Informational Resources


More Privacy, All the Time

3 min read

As is very obvious to the three regular readers of the FunnyMonkey blog, we care about privacy. Our work around privacy comes directly from our belief that learner agency and learner control are both essential elements in education, and frequently ignored elements of our educational process. Our commitment to learner agency informs much of the work we do - it's why, in addition to privacy, we care about open content, student-directed portfolios, and empowering people and organizations via open source tools.

Over the last eleven years, as part of our work with FunnyMonkey, we have been able to work on a range of projects covering all of these issues. We have been fortunate to work with some amazing people at some amazing organizations. Although software development is a big part of what we do, we never looked at software as the end goal of any project. Technology isn't neutral, and we always worked with people to make sure that any solution removed barriers to doing good work. If writing code was part of making things better, so be it.

For the last seven years, Jeff Graham has been directly involved in shaping and guiding the work we do. Jeff is a rarity among developers - equally comfortable discussing deployment process, the pros and cons of different open licenses, scalability, security, and the emotions of people as they interact with the software we build. There really isn't much we've done over the last seven years that hasn't been made better by his insights and expertise.

Increasingly, as our work around privacy has ramped up, we have been looking at ways to improve both awareness of privacy, and practice around privacy and security. Many of those ideas have been shared here, on this blog. Now, the time feels right - more people seem to be aware of a broader range of issues related to privacy and data use than at any point in the post-NCLB era.

To further this work, both Jeff and I will be joining Common Sense Media this summer. The team at Common Sense is already doing amazing work around student data privacy, and we are incredibly excited to be able to join them. The time feels right. Over the past few years, privacy was the thing we made time for among the other areas of our work. Now, privacy will be the thing we do. Fun times lie ahead.


Avoidable Privacy Snafus: Multiple Sets of Terms for the Same Service

2 min read

If you are building and selling an EdTech product, you can benefit from having people on the sales and marketing teams go through the user-facing interface of your product. As part of this process, you should also have members of these teams read through your terms of service and privacy policies.

This came up in a conversation today about the terms of service and privacy policies of a service called "PaperRater." I have a screengrab of the discussion.

I observed that one of the odd things about the service was that it appeared to have two separate policies: one for the free service, and a second set of terms for the premium service.

The terms for the free service are linked from the main PaperRater site: Terms of Service and Privacy Policy.

The different terms for the premium service are linked from the premium service signup page, and are available here: Terms of Service and Privacy Policy.

This screencast breaks down how this issue is presented to end users.

An easy and obvious fix for this would be to have the links from the premium signup page point back to the privacy policies and terms from the main PaperRater site. At that point, there could be a discussion about how to improve the actual terms, with clarity around which terms apply to the free and the fee-based service.

Of course, it's also possible that different terms should apply to fee-based and free services. But, if that is the case, then the differences should be made clear and transparent to end users.


Putting Policies On GitHub

3 min read

Over the last few years, we have been looking at ways in which privacy policies and data stewardship can be improved. Over that time, one of the issues we have encountered repeatedly is that it is difficult to track how and why policies change over time. This lack of transparency hurts people who want to learn about privacy, and how an application treats student data. It also hurts companies - these decisions should be part of organizational culture, and losing them means losing an opportunity to see how a company has evolved and improved over time.

These issues can be addressed via a small, simple change: placing terms of service, privacy policies, and other related policy docs on GitHub. Over on the Clever blog, Mohit Gupta has a great blog post describing how to get this done.

The short version: to get started, all you need to do is create a repository on GitHub that contains your terms. Ideally, use this structure: https://github.com/COMPANY_NAME/policies. Use Clever's terms as an example.

Putting policies on GitHub creates some immediate benefits that will accrue over time.

  • Increase transparency - Terms on GitHub are easy to find.
  • Create an annotated log of changes - Git (the system used on GitHub) is designed to manage changes in a codebase over time. If we apply this to privacy policies, this means that every change to a policy can be created with a corresponding note explaining why the change occurred. Over time, this creates an annotated list explaining every change.
  • Create an opportunity to close the gap between policy creation and software - Generally, gaps exist within companies between policies and developers. Policies governing the use of software are often created without any contact with the people who develop the actual software. This disconnect can result in policies that have little connection to how the software works, and policies that drift away from the organizational mission. Putting terms on GitHub makes them available in a space that is immediately accessible to developers.
  • Provide clear starting points and best practices for new companies - I have spoken with a lot of new companies who are concerned with getting policies right, but don't know where to start. Placing terms on GitHub creates an easily accessible, very visible starting point: a new company can fork and modify terms. This is no substitute for working with a lawyer who is versed in education, but having ready access to existing terms will help provide a solid foundation.

Putting terms on GitHub is not a panacea - this won't magically fix weak terms. However, making terms of service and policies available to broader audiences in an accessible format will help more people understand how data gets used - and doesn't get used - in software. Creating concrete steps that help companies commit to greater transparency helps shift norms around privacy. Creating tools that help us identify sound practice allows us to improve the conversation around privacy, one facet at a time.

Most importantly, this is something that can be done now.

I'm very happy to say that a group of companies have already committed to getting their policies on GitHub. In the next 1-2 weeks, we will be announcing the "official" launch, and doing some additional outreach. If you want to get your terms onto GitHub and be part of the initial announcement, please get in touch.


Things I Can't Believe I Need To Say: Yes, Doxxing is Bad

3 min read

In recent days, people have attempted to justify doxxing. Ironically, a person was doxxed in the name of student privacy. I didn't think that it would be necessary - in the education space - to have a conversation about why doxxing is a very bad idea, but here we are.

I left the text below as a comment, but I wanted to post it here as well so I have a copy.

If you are in the education world, or the technology world, please speak up on this issue.

From the comment:

I noticed in reading through this most recent post that you omitted this pretty thorough debunking of both the doxxing angle, and the actual conflict of interest that has been used to justify the doxxing: http://hackeducation.com/2015/03/21/doxxing/

You also left out https://storify.com/adriarichards/telling-my-troll-story-because-kathy-sierra-left-t

You also omitted https://www.schneier.com/blog/archives/2015/01/doxing_as_an_at.html - which documents doxxing going back to 2001. One of the comments on that piece is from a person who was doxxed in 1997.

Which is to say: reality often differs from the story you get told on Google trends.

But more than anything, I am left nearly speechless by this statement: "I will not spend time editing out info from public docs."

Why not? In all the cases you cite, the home address of the subject is completely irrelevant to the story you are trying to tell. Yet, you are willing to expose these people to the potential risk for harm because editing out an address will slow you down?

I've done this work. I've edited docs before - it takes around 30 seconds. Cleaning up docs so you are not exposing personal information is *sound research*. Please, add this into your workflow. It will improve your credibility.

Let's say a student hands in work filled with spelling errors. If their justification was, "Well, I would have corrected these things, but I didn't have time," what would your reaction be?

Finally, you also justify doxxing by saying that you only research adults, not children. This is a dangerous shield to use. Many of these adults *have children* - when you dox the parent, you put the child at risk.

Additionally, the original issue here (the social media monitoring) is rooted in expectations of privacy, and the expectation to be free from excessive, unnecessary surveillance. We *all* have those rights. You, me, and the people we disagree with. Adult or child.

Please - reconsider what you are saying in this piece. It is a dangerous escalation. At best, it will result in some Pyrrhic victories. At worst, someone will get hurt. Badly.
