How Are Schools Using Apple, Google, Microsoft, and Facebook Explaining Surveillance?

At the risk of stating the obvious, I've been following the news of widespread data collection by the NSA with some interest.

After watching things continue to unfold today - including President Obama's underwhelming defense of the program - these are some random thoughts and questions I have:

  • I'd like banks to get surveillance comparable to what civilians get.
  • I'd like to see the discussion broadened to include corporate responsibility for just acquiescing to these data requests.
  • Schools that went all in with iPads - how are you explaining to parents that your 21st Century Learning enrolled their children in 21st Century Surveillance?
  • Schools that went all-in with Google Apps or Microsoft EDU - how are you explaining that the benefits of cost savings appear to be offset by passive monitoring of the work within the school?
  • Schools that put a lot of time into building your Facebook presence - how will you explain that, by joining the school community on Facebook, you are also throwing your data into NSA servers?
  • For those of you who spent time analyzing and teaching others the "privacy" settings of Facebook, does this feel like time well spent, considering that - to at least the government and Facebook - there is no such thing as a privacy setting that works as advertised?
  • It sounds like, with Prism, the government outsourced TIA.
  • Given this level of cooperation between government and tech companies, how about we put that spirit of collaboration to work and solve the real problem of veterans waiting years for their benefits? If ever there was a problem that could benefit from good data management, the VA benefit system is it.

And yes, it is unclear how much - if any - student data is getting dropped into the net of data that continues to be given by American companies to the American government. To assume none is an act of willful naiveté that strains credibility.

The one thing I will say for Prism - according to a slide shown in the original piece, the program only costs $20 million a year to run. $20 million a year, to maintain and update a data store to spy on 300,000,000 people? It is, ironically, an example of efficient government spending. To put that in relative terms, that's only $3 million more than the cost of a single drone.
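For a rough sense of scale, the per-person arithmetic can be worked out directly. This is only a back-of-the-envelope sketch using the two figures quoted above; the variable names are my own, and neither figure has been audited.

```python
# Figures as quoted in the post; treat both as rough, not audited.
annual_cost_usd = 20_000_000   # reported yearly cost of the program
people_covered = 300_000_000   # population figure used in the post

# Yearly cost divided evenly across everyone covered.
cost_per_person = annual_cost_usd / people_covered

print(f"${cost_per_person:.3f} per person per year")
```

That works out to a little under seven cents per person per year.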

How Can We Teach Privacy?

Based on recent reports, it sounds like the NSA is regularly collecting data from major phone companies, nine major tech companies, and from credit card companies and ISPs (although it's not clear whether the credit card/ISP data collection is ongoing or intermittent).

The list of companies participating includes many major players within the States; it's difficult to imagine anyone using any technology within the States not using at least two of these companies on a daily basis. Most people likely use more. The list includes:

Prism
  • Facebook;
  • Verizon;
  • Apple;
  • Sprint;
  • AT&T;
  • Skype;
  • Microsoft;
  • Google;
  • YouTube;
  • AOL;
  • Yahoo;
  • Paltalk;

Many of the companies involved have issued carefully worded denials, generally saying that the companies never gave "direct access" to servers, and only shared information when "required by law." However, the phrase "direct access" is so vague as to be meaningless, and the program that allows the data grab could be legal under some interpretations of the 2008 FISA update (pdf download).

Even small amounts of data can have incredible predictive power. A blunt data point such as Facebook "likes" can predict politics and sexual orientation, and can be indicative of IQ. Some initial research suggests that "likes" can be indicative of health issues. Anonymized search data can reveal incredibly detailed, troubling information. Small amounts of anonymized data can be used to pinpoint individuals. Access to past location information can lead to precise predictions of where a person will be at any time. The level of detail that companies store - and can therefore share - is pretty amazing.

And, of course, with the data the US government has been collecting, they have an incredible trove of information. From the phone carriers, they have the time and duration of all phone calls. They have the location where these calls were made (possibly from the phone's GPS, and certainly from cell towers). They have the list of who talked with whom. They likely have comparable data from Skype.

A variety of these companies give the government access to friend lists, search histories, browsing histories (think Facebook ads; login "services" from a company on websites), and productivity work (Google apps suite, MS productivity tools). From Apple, there is a range of buying patterns and services; if you used the "find my iPhone" service, you've given them some very accurate location data. I'd also imagine App Store browsing habits could be interesting. Getting all of these data points in a single location, where they can be cross-referenced, provides an incredibly detailed look at the data footprint created by individuals.

For schools that are teaching media literacy and safe online browsing habits: how do you teach online safety and privacy in a world where there is effectively no privacy setting?

For schools that are launching 1:1 iPad initiatives along with Google Apps: how do you talk about the privacy of student data when the company responsible for safeguarding your students' information could be handing it over to the government? The move to hosted services - such as Google Apps for Education or Microsoft EDU - has been a steady move toward convenience at the cost of privacy. At what point does that cost become too high, or too unpredictable?

The details are still coming out on this story, but the outlines that we have now show that any notion of online privacy needs to be rethought.

On a closing note, why is it that our government and our tech companies can work together to assemble the technology to spy on our entire citizenry, but can't work together to get benefits to the veterans who fought to protect the freedoms our government and corporations are now trampling? Priorities, people. Priorities.

For those wanting to learn more, Bruce Schneier has a good writeup on the details of the spying program, and its implications.

Image Credit: "Prism" taken by viviandnguyen_, published under an Attribution Non Commercial No Derivatives license.

Re-usable content, back to the basics, and knowing your tools

I've been on this kick lately of looking at the original architects of the web and what people were doing at that time. This picked up steam in January 2013, right after our Open Content Authoring Day at EduCon, when we started looking at the structure needed to support flexible open content authoring. This led to looking at the origin of REST. The functionality is available in the HTTP standard, but over time we (meaning a lot of people doing web development) just ignored it, partly because of imperfect implementations and partly because we assumed everything was HTML. (Un)fortunately, things have mostly worked.

There are some issues with how people currently imagine web apps because we:

  • Bolted on a lot of "features" that broke or hid the underlying methods and protocols;
  • Assumed everything was HTML;
  • And as a result created content with embedded markup.

As we move forward, we should think about HTML, JSON, etc. as delivery types, not working types. That is, HTML and JSON are presentation issues, not editorial issues. This is the initial thought process we need when working with chunks. I say initial because we will likely expand and include methods or techniques to apply some structure around disparate chunks, but that is secondary to creating re-usable chunks, and it requires additional structure that is not intrinsic to the content itself. Concentrating on an API first should help us, and reviewing the original intent of the architects regarding REST APIs can provide insight into working with the tools we have, rather than having things work despite the tools we have.
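To make the "delivery type, not working type" idea concrete, here is a minimal sketch in Python. The Chunk class, its two fields, and the to_html and to_json functions are names I made up for illustration; they are not part of any existing platform. The point is only that the stored content carries no markup, and each format is produced at delivery time.

```python
import html
import json
from dataclasses import dataclass, asdict


@dataclass
class Chunk:
    """A hypothetical reusable piece of content, stored without embedded markup."""
    title: str
    body: str


def to_html(chunk: Chunk) -> str:
    # Markup is added only at delivery time; the plain text is escaped on the way out.
    return f"<h2>{html.escape(chunk.title)}</h2>\n<p>{html.escape(chunk.body)}</p>"


def to_json(chunk: Chunk) -> str:
    # The same chunk, delivered as JSON instead of HTML.
    return json.dumps(asdict(chunk), sort_keys=True)


chunk = Chunk(title="Cool URIs", body="Stable addresses outlive implementations.")
print(to_html(chunk))
print(to_json(chunk))
```

Because the chunk itself is plain text, adding another delivery format later means writing one more function rather than stripping HTML back out of the content.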

This article, from 1997, has a lot of good thought around content, structure, and related issues. Fortunately, the transpublishing issues mentioned in the article are not as relevant to work with open content.

This piece, from 1999 (and again from Ted Nelson), defines the roots of some of the problems we have yet to shake. The more things change, the more they stay the same.

Except, unfortunately, some things that should never change often do - like URLs. In 1998 Tim Berners-Lee explained that Cool URIs don't change. He even explains content negotiation, which can be used to help manage URIs and avoid exposing implementation details.

In order to progress we need to review our assumptions. As a trivial example, when we see things like 'page.html', 'page.php', or 'page.asp' show up in URLs, we have to stop and think "why are we exposing implementation details?"

These problems are tricky to solve, and some would argue that all abstractions are leaky, and thus any implementation is going to expose some level of implementation detail. Regardless of whether or not that view is reflective of reality, each level of implementation detail that we do expose limits our choices for changing our implementation in the future.
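One way to keep things like '.php' out of URLs is the content negotiation mentioned above: the client says which formats it prefers in the Accept header, and the server picks a representation for the same extension-free URI. The sketch below is a simplified illustration in Python, not a complete implementation of HTTP's rules. It ignores partial wildcards such as text/* and media-type specificity, and the function names, the AVAILABLE list, and the example URI in the comment are all placeholders of my own.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q value as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so types with equal q keep the client's order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header: str, available: list[str], default: str) -> str:
    """Pick a representation for one URI, e.g. the hypothetical /lessons/fractions."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default


AVAILABLE = ["text/html", "application/json"]
print(negotiate("application/json", AVAILABLE, "text/html"))
print(negotiate("text/html,application/json;q=0.9", AVAILABLE, "text/html"))
```

With this approach the URI stays the same whether the server behind it is PHP, Python, or static files, so swapping the implementation later does not break anyone's links.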

As we move forward with the development of a better open content authoring platform, the mistakes of the past and present are close by. Architectural habits and exposed implementations became unquestioned parts of system design. This is to the detriment of software getting built, and to the detriment of the people working with the software. Getting back to the basics avoids making the same mistakes.

I Have Some Word Docs...

A question we are asked on a fairly regular basis is:

I have a bunch of resources saved in Word docs. How can I release these as open content?

It's a pretty straightforward question, and one version of an answer is:

  1. Specify a license; and
  2. Publish your content online in something like Google Drive or Dropbox.

However, there are details and decisions buried in the steps outlined above that make this seemingly straightforward answer remarkably serpentine. This blog post is an effort to collect some of the various possible answers to that question in a single place.

Specify A License

First, the easy part: if you are the primary author of the work, and/or your work uses/remixes openly licensed content, all you need to do to make your content open content is choose a license, and specify how you want to be attributed. Then, include the license and your attribution information with your work, and voila! Done.

For example, this blog post (and all blog posts on FunnyMonkey.com) is licensed under a Creative Commons Attribution Share-Alike license. On my personal work, I ask that people link back to the FunnyMonkey home page.

So, my attribution text would look like:

This work is licensed under a Creative Commons Attribution Share-Alike license. If you reuse this work, please attribute authorship to Bill Fitzgerald at http://funnymonkey.com.

This formula will work as a start: This work is licensed under LICENSE. If you reuse this work, please attribute authorship to AUTHOR NAME at SOME LOCATION.
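If you have many documents to label, the formula above is easy to turn into a small helper. This is just a sketch in Python; the function name and its parameters are my own invention, and the example values are the ones used earlier in this post.

```python
def attribution_text(license_name: str, author: str, location: str) -> str:
    """Fill in the LICENSE / AUTHOR NAME / SOME LOCATION formula."""
    return (
        f"This work is licensed under {license_name}. "
        f"If you reuse this work, please attribute authorship to {author} at {location}."
    )


print(
    attribution_text(
        "a Creative Commons Attribution Share-Alike license",
        "Bill Fitzgerald",
        "http://funnymonkey.com",
    )
)
```

The printed text can be pasted at the top or bottom of each document before it goes online.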

For people who want to learn more about the process and see more options, Creative Commons offers a form that will help with selecting a license and generate this text for you.

Publishing the Content

Once you have a license and attribution text for your work, the only piece that's left is publishing your work. As mentioned above, putting docs into something like Google Drive or Dropbox is a viable option. People can access the work, and they can grab a copy of it themselves. Regardless of what you use, however, you want to make sure that it does not convert your content into a format that is more difficult to reuse than what you are starting with.

The choice of where you publish your content online has implications for two facets that are important when working with open content:

  • Ease of discovery - or, how easily can someone find your content via search; and
  • Ease of reuse - or, how easily someone can use or remix some of your content into a new work.

There are a couple of nice things about throwing your content online in something like Google Drive or Dropbox. First, it's fast, and this implies a level of convenience. Second, it's easy, and many people know how to copy something from Google Drive or download something from Dropbox.

However, neither Google Drive nor Dropbox is that good when it comes to ease of discovery. Additionally, using a proprietary service means that your content lives at someone else's place, rather than at a space you control.

An option that both increases the discoverability of your content and allows you greater control over where your content resides is to set up a blog using something like WordPress. If you start with their service, you can choose to migrate to your own hardware at a later date. But the most important element of using something like WordPress is that you can publish summaries of your content as blog posts (which will make your content easy to find) and then upload the individual Word doc containing the lesson.

Then, when you have all of your content uploaded with explanatory summaries, you can create a meta-post that links all the pages of your content. This combination takes more time, but it increases the likelihood of someone finding and understanding your work. An additional advantage of using blog software to share your content is that it gives people interested in collaborating on the work an easy means to contact you.

It's also worth noting that uploading Word docs should be viewed as a transitional step, and that the eventual goal should be text and supporting media that can be browsed online. However, the conversion from a doc stored in Google Docs, Word, or LibreOffice takes time, and for many people that time can make the difference between sharing and not sharing.

An additional step to help other people find your work is to add an entry in OER Commons. OER Commons acts as a clearinghouse for all types of OER, so being listed there will help more people find your work. Listing your work on OER Commons is a viable option whether your work is stored in Google Docs, Dropbox, WordPress, or some other site.

Closing Thoughts

In closing, if you have directories full of work that you have created and want to share as open content, the shortest, fastest path to doing so is to put the work online in something like Google Drive or Dropbox. While these services are fast and convenient, they are also less than ideal when it comes to supporting community around open content, and reuse of open content.

Using blogging software to share content makes it easier for other people to find content, and starts to incorporate the possibility of collaborating with other people on maintaining the content.

For those looking to get started sharing open content, though, the only way to get it wrong is to not share. Choose a license, and get it on the web. The only way to ensure that your content has a limited impact is to not share in the first place.

If We Need Data, Who Collects It?

Getting data on how people learn and how they can be supported while learning is a worthwhile goal.

However, collecting data takes time, and the means by which these data points will be collected and stored have yet to be identified.

Even capturing a subset of the data specified in the CEDS standard (and this is the data standard implemented by InBloom) requires a significant investment in - at the very least - time, technology, staff training, and data analysis skills.

Kindergarten and the Common Core

I would also wager that many districts are swamped with the rollout of the Common Core standards, to the point where their ability to articulate their guidelines about data collection is limited or incomplete.

In looking at the most obvious ways that this data will be collected, a few options come to mind:

  • Standardized tests - these are easy data points to collect, and there is an obvious push to use standardized assessments to fill in data points. However, given that standardized tests that align to the Common Core don't exist yet, and that the early versions are expensive, flawed, and controlled by vendors, the quality and value of this data are suspect.
  • Data collection via educational software - This approach simplifies the process of gathering data (as the machines track interactions) but we should not confuse quantity with quality. Additionally, if we begin to rely on apps to collect data about learning, we are getting into the process of building software to meet two distinct goals: tracking specific data points, and teaching lessons. Given that tracking data points is what pays the bills, the data collection can easily trump the learner needs. Additionally, the educational value of educational software is very much an open question.
  • Teachers collect data - With this approach, the job description of a teacher expands to include increased data collection responsibilities. Teachers already do a fair amount of this (it's called grading!), but some of the data collection schemes being floated as options would require significant time to implement (what does data collection look like for a high school teacher who sees 150 students a day?) as well as training in best practices for data collection.

All of these approaches are going to be flawed. Technologically mediated solutions - via standardized tests or learning software - eliminate time that should be spent in more authentic learning. Pushing the burden onto teachers reduces the amount of time teachers have to work with students, and changes the nature of the job. There are a lot of initiatives calling for increased data, and increased use of data. These calls for increased use of data, however, generally ignore the time and effort required to collect useful and reliable data. In the end, if the data collected is going to be useful in supporting teachers and students, collection will likely fall on the backs of teachers and school staff - to the detriment of instructional time, at the expense of more constructive time with children.
