Report questions why colleges consider high school disciplinary records | InsideHigherEd

Colleges appear to be using inaccurate and arbitrarily collected/shared data in their admissions process.

Where The Sidewalk Ends: Wading Through Google's Terms of Service for Education

11 min read

Google Apps for Education has been very popular in K12 and higher ed. The service is free, and Google makes some carefully phrased claims about how Apps for Edu does not show ads to users within the core suite of Apps. These claims are often repeated with less nuance by consultants who have been certified to train schools and districts on using Google Apps. Unfortunately, as is often the case, the reality doesn't live up to the sound bite. In this post, we will examine the loopholes that permit data collected from students with Google Apps accounts to be used for non-educational purposes.

Google has five main issues that complicate absolute claims about what Google does or doesn't do with data collected from people within Google Apps for Edu.

We'll get into more detail in this post, but the tl;dr version runs like this:

Google defines a narrow set of applications as "core" Apps for Edu services. These services are exempt from having ads displayed alongside user content, and from having their data used for "Ads purposes". However, apps outside the core services - like YouTube, Blogger, and Picasa - are not covered by the terms of service that restrict ads. The same is true for integrations of third party apps that can be enabled within the Google Apps admin interface, and then accessed by end users. So, when a person in a Google Apps for Edu environment watches a video on YouTube, writes or reads a post on Blogger, or accesses any third party app enabled via Google Apps, their information is no longer covered under the Google Apps for Education terms.

To put it another way: as soon as a person with a Google Apps for Education account strays outside the opaque and narrowly defined "safe zone" everything they do can be collected, stored, and mined.

So, the next time you hear someone say, "Google apps doesn't use data for advertising" ask them to explain what happens to student data when a student starts in Google apps, and then goes to Blogger, or YouTube, or connects to any third party integration.

Background

Google has been making a concerted effort to improve its privacy practices in education. In early 2014, it came to light that Google was data mining email in education products. This was followed up a few months later by the announcement that Google would no longer display ads in core Google Apps, and would no longer scan emails in Apps for EDU.

This shifted practice appears to be the origin of the claim that "Google doesn't collect any data on students." This post by Tracy Mitrano gives a more detailed overview and background.

There's A Hole In The Bucket

In an earlier post last week, I explored some basic issues with even finding the Google Apps for Edu terms of service. In that post, I also outlined some quick and easy fixes for some of the more basic problems.

One of the problems identified in the earlier post has been fixed in the last week: the link to the page that outlines the core services now actually points to the correct location. The list of apps covered under the core Apps for Edu terms includes Gmail, Calendar, Drive, Hangouts, Sites, Contacts, Groups, and Google Apps Vault.

The list of additional services not included and covered under Edu terms includes Blogger, YouTube, Maps, Custom Search, Picasa, and Web History.

So, if a school using Google Apps for Edu wanted to do a unit on digital citizenship and time management and use Web History as a teaching tool, the only way to do that would be to throw student data into Google's normal terms of service, where student data could be mined and sold.

Additionally, while Google's specific terms for edu state that search data would not be scanned for "Ads purposes" it looks like searches via any custom search appliance would be scanned and mined. I'd love to get clarification from within Google on how data in custom searches is handled.

When the administrator of a Google Apps for Education instance enables non-core services covered by different terms of service, it's not particularly clear to admins that different terms apply.

When end users access these services, they do it under the umbrella of their Google Apps account. From an end user perspective, it doesn't make sense that these services would be under different terms, and the login process does nothing to highlight that users are entering a different part of Google's corner of the web, governed by different rules. We go into additional detail on how this works later in this post.

Integration with Third Party Apps

The issues outlined above for non-Core apps are worse for third party integrations available through the Marketplace.

Third party integrations are enabled by admins within the Google Apps Admin console. Once these apps are enabled, users within the Google Apps domain can access these additional software packages. "Integration" usually starts with single sign on and a common identity between the Google Apps domain and the third party vendor, but it could potentially also cover sharing contacts and other data. It's not always clear and obvious to Google Apps admins that they are creating an environment where learner data is flowing to third party vendors. Additionally, when a learner or teacher accesses an app that has been enabled via Google apps, it feels like part of a unified experience. It's a great user experience, but it's a data privacy nightmare. Because the integration is clean, it feels like part of the same system, which implies that the same rules would be in place.

However, every time a learner accesses a third party app via their Apps for Edu account, their data flows to the third party vendor, and is governed by the terms set by that vendor. Google's rules no longer apply.

Let's Talk About "Ads Purposes"

In their education-specific terms of service, Google makes the following statement about data and ads:

Claim of no ads

1.4 Ads. Google does not serve Ads in the Services or use Customer Data for Ads purposes.

This statement sounds pretty good. Google doesn't serve ads.

However, it's worth remembering that not serving ads is not the same as not processing or mining data. You can mine data, and derive benefit from what you learn in the process, without serving ads. It's also unclear what exactly "Ads purposes" means - it is vague to the point of meaningless. Google could improve this individual issue in two ways. First, they could define exactly what they mean when they say, "Ads purposes." Second, they could define exactly how they process data collected within the core Apps for Edu suite, and how they use that data.

In section 2.2, Google buries a reference to Non-Google Apps Products in the Compliance section (emphasis added):

Non-Google Apps terms

2.2 Compliance. Customer will use the Services in accordance with the Acceptable Use Policy. Google may make new applications, features or functionality for the Services available from time to time, the use of which may be contingent upon Customer's agreement to additional terms. In addition, Google will make other Non-Google Apps Products (beyond the Services) available to Customer and its End Users in accordance with the Non-Google Apps Product Terms and the applicable product-specific Google terms of service. If Customer does not desire to enable any of the Non-Google Apps Products, Customer can enable or disable them at any time through the Admin Console.

By burying the concept of Non-Google Apps Products, Google makes this element of the Apps for Education terms unnecessarily complicated.

In section 16 of the terms, Google lists out nearly fifty separate definitions, including this one:

Link from section 16

"Non-Google Apps Product Terms" means the terms found at the following URL: http://www.google.com/apps/intl/en/terms/additional_services.html, or such other URL as Google may provide from time to time.

So, for those playing along at home, Google starts with an absolute statement in section 1. They undercut that statement in section 2. They then provide the link to the actual terms in section 16, but the link is buried within nearly 50 other definitions.

When we follow the link to the Non-Google Apps Product Terms, the first point finally spells out the condition that allows user data from within Google Apps for Education to leak into more permissive terms of service:

Not covered. At all.

Not Subject to Google Apps Agreement. The Additional Services are not governed by the Google Apps Agreement, but are governed only by the applicable service-specific Google terms of service.

After knitting together related clauses from three different sections of the terms of service, and following a link to a completely separate set of terms, we finally see that the terms make a clear distinction between core Apps for Education, and everything else. However, because all of these apps appear in the Admin Panel of Google Apps for Edu, and in many cases the person administering Google Apps is not the person in charge of vetting terms for Google Apps, this difference is, at best, unclear.

So What Does All This Mean, Again?

We've covered a fair amount of ground in this post, and gotten deep in the weeds in Google's policies. The way the policies are written, it seems like one clear absolute is that ads will not be displayed alongside user content.

It's not entirely clear, however, what Google does do with any data collected from the core apps within Google Apps for Education.

It is also clear that as soon as a student or teacher leaves the narrowly defined limits of core Google apps, their data is up for grabs to be used for advertising, or any other purpose defined in Google's general terms of service. Unless a Google Apps for Education account is configured in an incredibly locked down way, it's hard to see how learners can avoid - or even know - where their information is going, and the terms under which it is being used.

But the clear takeaway: as soon as a learner strays outside the core Google Apps offerings, their data can be used for a range of non-educational purposes.

Suggested Improvements

There are a range of ways that Google's terms for education could be improved. The suggestions here are the tip of the iceberg, and ONLY address the issues that make it difficult to understand exactly what Google is doing. Once Google has improved the readability and transparency of their terms, we could go into more detail on specific ways that the terms can be improved to protect student privacy.

To improve some of the issues listed here, Google should:

  • Explain exactly how learner data will be scanned within the core Apps for Edu purchases;
  • Extend the education terms of service for all other Google apps that aren't currently covered as part of the Core apps suite. If there are applications that Google owns where this is not possible, they should be removed from the free offering list and treated like any other third party integration;
  • For third party integrations and Google products that use a different terms of service, add a step into the process for Google Apps domain admins that highlights and explains that all end users will be sending data to a third party, to be covered under different terms;
  • On a regular basis (every three to six months?), Google should email an apps report to the purchaser of the domain and all domain admins summarizing the enabled apps, and which ones fall outside Google's core Apps for education. This way, unused apps could be pruned, and in the case of staff turnover, the existing setup could be reviewed. This would also allow domain admins the chance to review privacy policies and terms of enabled apps within the domain.

There are a host of other things that could be done that include editing the terms of service for clarity. However, the issues highlighted in this post provide some easy starting points.

Orange schools monitoring student, staff social media posts - Orlando Sentinel

A Florida school district will monitor social media posts from students and staff, via a third party vendor, Snaptrends. This vendor "will assist district law enforcement and security personnel in monitoring publicly available social media communications that are relevant to school operations and personnel." Or, in other words, a third party vendor will engage in surveillance of kids and staff and will stovepipe concerns to law enforcement.

Civil Rights, Big Data, and Our Algorithmic Future

Solid research and overview of the biases in data collection and use.

How Companies Turn Your Facebook Activity Into a Credit Score | The Nation

This article shows the negative impacts of the profiling that occurs when data is collected on people, then sold and mined.

Migrating from Drupal 7 to Known

10 min read

What's Next?

As you can see, funnymonkey.com has had quite a facelift. When it became clear that FunnyMonkey would be going through a transition, Bill and I reviewed what the future of funnymonkey.com would look like. Historically the reason to keep coming back has been Bill's blogging on education and education policy, so the focus would be on something that worked well as a blogging platform. The net was cast wide and we considered many options, including staying with Drupal, or migrating to WordPress, Laravel, Revel, Go, etc.

In the end we chose Known. I had met Ben Werdmüller and Erin Jo Richey at Reclaim Your Domain: The UMW Hackathon, so Known was already on my radar. Besides being great people to talk with and work with, Erin and Ben have a great vision for Known and a solid architecture. Known is built with the ethos of the IndieWeb movement and the POSSE publishing model. The ethos of Known and FunnyMonkey line up pretty closely.

How do we get our content into Known

Okay, so we've chosen Known, and we have 10 years of content sitting in a Drupal 7 site. Now what?

After a cursory review, the import and export routines within Known appeared to be hardcoded and, as far as I could tell, not pluggable. That's a minor disappointment (more on this later). At this point it looked like a custom plugin was the way forward. Known plugins are pretty straightforward, and looking at the default ones proved to be quite helpful. For instance, take a look at Bridgy's Main.php file (found under IdnoPlugins):


    namespace IdnoPlugins\Bridgy {
        use Idno\Common\Plugin;
        class Main extends Plugin {
            function registerPages() {
                \Idno\Core\site()->template()->extendTemplate('account/menu/items', 'bridgy/menu');
                \Idno\Core\site()->addPageHandler('account/bridgy/?','IdnoPlugins\Bridgy\Pages\Account');
            }
        }
    }

That's it for the minimal plugin: just register some pages and templates. Past that, there is an expected directory structure where Known will find the registered page handlers and templates. Again, reviewing Bridgy:


Bridgy/
├── Main.php
├── Pages
│  └── Account.php
├── plugin.ini
└── templates
    └── default
        └── bridgy
            ├── account.tpl.php
            ├── facebook.tpl.php
            ├── menu.tpl.php
            └── twitter.tpl.php

We see that the call to \Idno\Core\site()->addPageHandler() registers a page handler for account/bridgy, implemented by the class IdnoPlugins\Bridgy\Pages\Account (the Pages/Account.php file in the tree above). That's the basic structure. I'm covering Bridgy for a couple of reasons:

  1. It's simple: It doesn't take much code to constitute a plugin in Known.
  2. It's included: The code I'm about to show you is my first Known code and is largely a one-off, since it is a migration and will not have an ongoing use. So using Bridgy is a bit more illuminating, as it's fair to say it is likely idiomatic Known code.

Writing a content migration plugin

Caveat: This is not exemplary code and can be improved in many ways. What it does show you is how easy it is to get content from other systems into Known. There are many points worth considering for refactoring, such as storing the new ID to old ID association as the content is imported and not outside of the save routine(s). That said, you can find the code we used over here.

I'm going to defer the detailed points of the code with the hopes that the code is commented well enough and easy enough to read. This will instead focus on the overview of the process.

Assumption

  1. The Drupal DB will be available during the import routines. For this we just backed up the FunnyMonkey.com db and restored it locally on our development stack.
  2. The Drupal files directory will be available during the import routines. These were just rsync'd from the production site into /srv/www/legacy/files.
  3. The migration will proceed in the following order, as dictated by dependencies:
    1. Files: Have no requirements
    2. Users: Have profile pictures and thus require Files
    3. Nodes: Have authors and files associated and thus require the Files and Users imports
    4. Comments: Require Nodes
  4. The source content is in MySQL
  5. URL rewrites will be created to map all content
  6. Some method to check old content and new content will be necessary for quality checking
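The dependency ordering in assumption 3 can be checked mechanically. The following is a minimal Python sketch, not part of the migration plugin (which is PHP); the step names and the `migration_order` helper are hypothetical and only mirror the list above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each import step maps to the set of steps it requires,
# mirroring the dependency list in the assumptions above.
DEPENDENCIES = {
    "files": set(),
    "users": {"files"},
    "nodes": {"files", "users"},
    "comments": {"nodes"},
}


def migration_order(dependencies):
    """Return the steps in an order where each step follows its requirements."""
    return list(TopologicalSorter(dependencies).static_order())


print(migration_order(DEPENDENCIES))
# prints ['files', 'users', 'nodes', 'comments']
```

Because each step here depends on the one before it, only one valid order exists. With a less linear set of content types, the same approach would also flag circular dependencies by raising `graphlib.CycleError`.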

Writing our plugin

Registering pages


function registerPages() {
    // Administration page
    \Idno\Core\site()->addPageHandler('admin/drupalmigration','\IdnoPlugins\DrupalMigration\Pages\Admin');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/users','\IdnoPlugins\DrupalMigration\Pages\User');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/nodes','\IdnoPlugins\DrupalMigration\Pages\Node');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/files','\IdnoPlugins\DrupalMigration\Pages\File');
    \Idno\Core\site()->addPageHandler('admin/drupalmigration/comments','\IdnoPlugins\DrupalMigration\Pages\Comment');
    \Idno\Core\site()->template()->extendTemplate('admin/menu/items','admin/drupalmigration/menu');
}

In order, we register pages for the following:

  1. Admin page: this will be our overview where we set our database settings. Arguably this could be omitted and just hardcoded.
  2. User Page: This will be the overview for user import.
  3. Node Page: This will be the overview for node import.
  4. File Page: This will be the overview for file import.
  5. Comment Page: This will be the overview for the comment import.

Then we register a template extension to get our 'DrupalMigration' into the menu. This is just a snippet that extends the existing menu to include our options for the DrupalMigration. Review the contents of

DrupalMigration/templates/default/admin/drupalmigration/menu.tpl.php

to see how this injects our menu options into the default menu.

Implementing a page

I'm only going to cover the process for the File portion, as that is our first page and is representative of the process for all the other pages (excluding the overview page where the db settings are input). The framework for this file is the following:


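namespace IdnoPlugins\DrupalMigration\Pages {
    class File extends \Idno\Common\Page {
        // Handles GET requests: shows the overview of files to import.
        function getContent() {
        }

        // Handles POST requests: runs the file import.
        function postContent() {
        }
    }
}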

We extend \Idno\Common\Page and implement two methods, one for a GET request and one for a POST request. In the file's getContent() method we ensure that only admins can access this page via $this->adminGatekeeper(); then we proceed to build out some tabular data to give an overview of the files to be imported and their status. We store ongoing migration data inside of Known's site config. Arguably we should have used an external table to manage this, which would be especially necessary for larger migrations. The filemap, which tracks files we have already imported, is stored in \Idno\Core\site()->config()->drupal_migration_file_map. Most of this code consists of building up a data structure which we then pass to our admin/file template.

You can review the template in DrupalMigration/templates/default/admin/file.tpl.php. Again, this should be better architected to do more of the logic work inside the getContent() method so that the template is just iterating and outputting and not doing any calculations. That said, our template does do a bit of work to present some URL rewrites for those files that have been migrated so that we can include those in our .htaccess after the migration.

For the postContent() method we again ensure the user is an admin, then iterate over the files and use our plugin class's methods to handle all of the heavy lifting of getting the files into Known. After we process all of the files we redirect via $this->forward(\Idno\Core\site()->config()->getDisplayURL() . 'admin/drupalmigration/files'); back to the same page so the user can see the results.

Additional details

Hopefully everything so far has been helpful. The code could be used as a starting point for other Drupal site migrations into Known. The constants at the top of the file will need adjustment to appropriately grab your content. Assuming you use the same field names for the SQL queries, the rest of the import code should largely work. Outside of those constants at the top, the following methods will likely need review and refactoring to meet your needs:

  • getFiles(): This currently includes a bunch of unmanaged files and dummies them up to match the managed files data structure. The list of unmanaged files that should migrate will vary from site to site.
  • addUser(): Hardcodes adding a couple users as admins. This could be omitted. All user accounts have mangled passwords between 68 and 127 characters in length. The idea here is to require users to set a new password via Known's password reset process.
  • rewriteURL(): Can be modified to clean up any garbage content and normalize URLs into one particular format. We opted to switch to relative rather than absolute links so that testing would work fine when we were not on the funnymonkey.com domain. This could also be extended to support rewriting node references to other nodes as well, but we opted to defer to 301 (moved permanently) redirects.
  • rewriteContentLinks(): We rewrite content references using our rewriteURL() process so that we can map files to their new destination and normalize on the same process for all content.
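The switch from absolute to relative links described for rewriteURL() can be illustrated with a short Python sketch. This is not the plugin's PHP code: the `to_relative` helper and the `SITE_HOSTS` set are hypothetical names, and treating both the bare and `www.` hosts as "ours" is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the hosts whose links should become relative.
SITE_HOSTS = {"funnymonkey.com", "www.funnymonkey.com"}


def to_relative(url):
    """Strip scheme and host from links that point at our own site."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in SITE_HOSTS:
        # External or already-relative links are left untouched.
        return url
    path = parts.path or "/"
    return urlunsplit(("", "", path, parts.query, parts.fragment))


print(to_relative("http://funnymonkey.com/node/12?page=2#c"))
# prints /node/12?page=2#c
```

Keeping links relative means the imported content can be tested on a development host without every internal link pointing back at production.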

Taxonomy is handled by mapping to hashtags appended to the end of the content. See addNode() for more details.
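As a rough illustration of the taxonomy-to-hashtag idea, here is a Python sketch. It is not the addNode() implementation: the helper names are hypothetical, and the CamelCase tag format and the blank line before the tags are assumptions made for the example.

```python
import re


def term_to_hashtag(term):
    """Turn a taxonomy term such as 'education policy' into '#EducationPolicy'."""
    words = re.findall(r"[A-Za-z0-9]+", term)
    if not words:
        return ""
    return "#" + "".join(word[:1].upper() + word[1:] for word in words)


def append_hashtags(body, terms):
    """Append one hashtag per term to the end of the content body."""
    tags = [tag for tag in (term_to_hashtag(term) for term in terms) if tag]
    if not tags:
        return body
    return body.rstrip() + "\n\n" + " ".join(tags)


print(append_hashtags("Hello\n", ["privacy", "open source"]))
```

Terms that contain no letters or digits are dropped rather than turned into a bare "#".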

URL rewrites

In addition to each step in the migration rendering a list of rewrites at the bottom of the import screen, Drupal also uses url_aliases that we need to account for.

The following SQL does that for us. We omit all url_aliases that are not users or nodes.


SELECT CONCAT('RewriteRule "^', alias, '$" "', source, '" [L,R=301]') FROM url_alias where source like '%user%' OR source like '%node%';
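For readers who would rather post-process the alias rows outside of MySQL, the same rule generation can be sketched in Python. The `rewrite_rules` function is a hypothetical helper, and the input is assumed to be (alias, source) pairs already read from the url_alias table.

```python
def rewrite_rules(aliases):
    """Build one RewriteRule line per alias whose source is a user or node path.

    Mirrors the SQL filter above. Note that Python's `in` is case-sensitive,
    while MySQL's LIKE is usually case-insensitive under default collations.
    """
    rules = []
    for alias, source in aliases:
        if "user" in source or "node" in source:
            rules.append(f'RewriteRule "^{alias}$" "{source}" [L,R=301]')
    return rules


rows = [("about", "node/1"), ("tags/x", "taxonomy/term/3"), ("bill", "user/2")]
for rule in rewrite_rules(rows):
    print(rule)
# prints:
# RewriteRule "^about$" "node/1" [L,R=301]
# RewriteRule "^bill$" "user/2" [L,R=301]
```

Like the SQL, this sketch does not escape regular expression metacharacters in the alias. Aliases containing characters such as `.` or `+` may need escaping before the rules go into .htaccess.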

Points for improvement in Known

Overall, the experience with Known was fantastic. It was refreshing to work with a system that has such a tightly focused use case and a quality implementation. That said, I saw the following as potential opportunities for improvement.

Modular import/export process
Arguably this can be better handled with custom code like we did. However, having a modular import/export process lowers a barrier to collaborate and get content into Known. Perhaps the import/export functionality should itself be a Known plugin. In fairness, what is currently there handles other platforms that have a standardized export process, and that's a good first step. Besides, Drupal is far from being in a place to have a standard export routine across various implementations. For Drupal there could be a standard Views export template that you map your content into, and then a generic Drupal-to-Known importer that imports data in the format defined by that template, but that's a Drupal project.
AddAnnotation() doesn't return ID
The other processes and methods for saving other Known content all return the newly created ID when creating new objects. This is really a minor nitpick but it made checking the import routine a bit haphazard and prevented a one-to-one on the URL rewrites. In our case we opted to rewrite to the source document rather than the specific comment. While this loses the direct link it does not break the link in the event anybody had linked to the site externally.

Google Wants to Turn Your Clothes Into a Computer - NYTimes.com

Google is working on conductive fabric that can be used to interact with various devices. It's like a data trail in your pants.

Students, Data, and Blurred Lines: A Closer Look at Google’s Aim to Organize U.S. Youth and Student Information

Thorough read on GAFE and Business practices. Solid footnotes. From April, 2015.

Apple's Plan For Managing COPPA Consent Has A Couple Problems

4 min read

For schools running programs using Apple hardware, Apple has set up a guide to streamline the process of creating an Apple ID for each student.

This guide includes a section on creating accounts for students under the age of 13:

 

Under the Children's Online Privacy Protection Act (COPPA), Apple must obtain verifiable consent by a parent or guardian to our Privacy Policy and Parent Disclosure and Consent notice before an Apple ID can be created for a student under 13 years of age (see apple.com/privacy/parentaldisclosureconsent.pdf). As part of the communications process with parents or guardians, your institution will need to assist Apple in ensuring consent is obtained from a student's parent or guardian.

The instructions quoted above indicate that the school is the broker for arranging parental consent. This is a fairly standard practice among edtech companies (whether or not this is good practice is a different conversation). We will also look at the linked Parental Disclosure Consent doc later in this post.

Apple's guide includes step by step instructions for creating Apple IDs. The first step merits a close reading:

 

Step 1. Prepare Apple ID request. To create new Apple ID accounts, you will need to upload a correctly formatted, comma-separated value (CSV) file containing a list of students who need Apple IDs. To download a template, go to Import Accounts and click Download Account Template. To edit the template, use an application such as Numbers, Microsoft Excel, or another application that can save CSV files. To complete the template, you will need to provide the batch number or batch name, student name, student Apple ID, student date of birth, and the parent or guardian email address for each student request.

To highlight a couple of key points: Apple's process requires that every school prepare a text file (CSV stands for comma-separated values) with the name, birthdate, and parent contact of every student. CSV files are plain, unencrypted text, so anyone who gets this file can read all of the information within it. So, Apple's recommended method for creating student IDs requires creating a comprehensive list of sensitive student data in one of the least secure formats available.

This post explaining the process goes a step further; they use student Gmail addresses for the Apple ID. Practically, this means that if this file was ever compromised, the people who accessed the file would have student names, dates of birth, parent email, and student email.

In case people wonder why this is a concern: when you store data in an insecure format, you expose your students to greater risk - as happened with these students in New Orleans. And these students in Escondido. And these students in Seattle. And these students in Maine.

By encouraging the use of CSV files to move sensitive student information, Apple encourages insecure data handling practice. It's unacceptable for any age group, but it somehow feels worse for students under the age of 13. The fact that this is accepted as sound practice by edtech staff anywhere is also problematic.

 

Parent Disclosure

The full text of the Parent Disclosure doc is essential reading, but we will highlight a couple sections here. For example, after Apple has parental consent, they are clear that they will collect a range of information from all students.

 

We may collect other information from your student that in some cases has been defined under COPPA as personal information. For example, when your student is signed in with his or her Apple ID, we may collect device identifiers, cookies, IP addresses, geographic locations, and time zones where his or her Apple device is used. We also may collect information regarding your student's activities on, or interaction with our websites, apps, products and services.

Apple states very clearly that they will tie location and behavior to an identity for all students, of all ages.

 

At times Apple may make certain personal information available to strategic partners that work with Apple to provide products and services, or that help Apple market to customers. Personal information will only be shared by Apple to provide or improve our products, services and advertising; it will not be shared with third parties for their marketing purposes.

In this clause, Apple clearly states that they (Apple) will use personal information to improve Apple's advertising and marketing.

According to Apple's published policies and documentation, participating in a school's iPad program requires student data flowing to Apple's marketing and advertising, and encourages sloppy data handling by schools.


Wherefore Art Thou, Google Apps For Edu Terms of Service?

2 min read

Trying to find Google's Apps for Education Terms of Service page is akin to spending a weekend unicorn hunting while quaffing cocktails from the Holy Grail.

And please, if I have missed an obvious place where Google's current Terms of Service for Apps for Edu are linked, please tell me. I have spent a foolishly long amount of time trying to nail this down, and I would love to know that I had missed something obvious. The shadow of a PEBKAC looms long, and could easily extend into this examination.

Our quest for current Google Apps for Edu Terms of Service leads from Google's Trust page, to the signup page where a district or school would get Apps for Edu, to the Product list, to the product overview page, to the top-level Google for Education page, to Google search (which leads to outdated terms).

It seems like the most reliable way to see the Terms of Service for Google Apps for Edu is to ask people who are already running Google Apps. It really shouldn't be this complicated.

Google could improve this easily by taking the following basic steps:

  • Add a link to the correct Terms of Service from their Trust page;
  • Add a link to the correct Terms of Service from all product description pages;
  • Add a link to the Privacy Policy and Terms of Service from the signup page for Google Apps;
  • Fix the broken link on their Trust page;
  • Add "Updated" dates to the current terms of service;
  • Add text to outdated policies that are no longer active, and link to current policies.

Future posts will address some of the ways that Google's terms allow student data to leak out and be used outside the Apps for Edu terms of service. However, that is a separate issue from basic transparency. For a company founded on making data on the web more discoverable, the opacity of Google's basic terms should be an easy problem for Google to fix.
