
Bill's blog

Simple Is Usable, or Why Friends Don't Let Friends Apply Metadata Prematurely

As part of our work on open content, and how to design systems that support authoring and translation that are both useful and usable, we have been thinking about the role of metadata and, by extension, search. This post contains some incomplete thoughts - a line in the sand, more than anything - and, six months from now, will provide something for all of us to laugh at. Possibly, we will all be able to laugh at this sooner than that. Time can be cruel.

In other words, I am firmly reserving the right to recant any or all of what I'm saying here. I'd love to hear different viewpoints on this.

Keep Data Simple

This sounds - and is - pretty basic, right up until it's time to implement an actual system. At that point, people "just need this one field."

In building data systems, additional fields are the equivalent of scope creep.

faceted

Humans Should Only Enter Metadata In Precisely Defined Circumstances

We'll get to this in more detail later in this post, but whenever possible, metadata should be derived from the data.

In some cases, this is simple: the author of a piece of content is easy to derive. Ditto for the date a piece was created.
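
As a minimal sketch of what "derived" can mean here, assume a system that already stores a revision history for each piece of content. The `Revision` class and its field names are hypothetical, not taken from any particular platform; the point is only that author and dates fall out of data the system already has, with no form field for a human to fill in.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Revision:
    """One saved version of a piece of content (hypothetical structure)."""
    username: str
    saved_at: datetime
    body: str


def derive_metadata(revisions):
    """Derive author, dates, and length from the revision history."""
    ordered = sorted(revisions, key=lambda r: r.saved_at)
    first, last = ordered[0], ordered[-1]
    return {
        "author": first.username,                      # whoever saved first
        "created": first.saved_at.date().isoformat(),  # earliest save
        "modified": last.saved_at.date().isoformat(),  # latest save
        "word_count": len(last.body.split()),          # from the current text
    }
```

Nothing in the returned dictionary was typed in by a person as metadata, which is the property worth protecting.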

A good example of metadata that should be entered by a human is a license.

But, in the case of a person remixing data that uses different licenses, the pool of possible licenses for the remix should be derived.
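
One way to picture a derived license pool is as a set intersection. The sketch below assumes a lookup table of which licenses an adaptation of each source may carry; the table is a deliberately simplified illustration covering four Creative Commons licenses, not a complete or authoritative compatibility chart, and `remix_license_pool` is a made-up name.

```python
# Simplified, illustrative table: for each source license, the licenses
# an adaptation may be released under. Not a complete or legal reference.
ADAPTATION_OPTIONS = {
    "CC-BY": {"CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-NC-SA"},
    "CC-BY-SA": {"CC-BY-SA"},
    "CC-BY-NC": {"CC-BY-NC", "CC-BY-NC-SA"},
    "CC-BY-NC-SA": {"CC-BY-NC-SA"},
}


def remix_license_pool(source_licenses):
    """Return the licenses that every source in the remix allows."""
    pools = [ADAPTATION_OPTIONS[name] for name in source_licenses]
    if not pools:
        return set()
    return set.intersection(*pools)
```

Under this table, a remix of CC-BY and CC-BY-SA material leaves the author exactly one choice, and a remix of CC-BY-SA and CC-BY-NC material leaves none. The human still picks the license, but only from a pool the system worked out.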

Your Picture Is My Image Is Her Binary

System-defined metadata can be useful, but it will be most useful for the people who designed and built the system, as they are the ones who define the system-specific meanings of the metadata terms.

In other words, your metadata will be useful to you, but it might not be useful to your users. For better or worse, metadata is rooted in language, and words carry baggage and connotations that, among a large group of individuals, make a universal meaning elusive at best.

With this in mind, the "best" metadata is often good search.

But Community Tagging Is Awesome

No, it isn't. Community tagging creates the appearance of structure and organization when what you really have is a chunky stew of chaos.

If you can get enough people contributing tags, then - maybe - you will be able to pull some signal from the noise, but that also assumes a large number of people and a robust search technology.

Faceted Search: Blech or Ugh?

In designing search systems for sites, faceted search can be useful for providing structure when sifting through content. However, is faceted search something that we actually appreciate, or something that we have grown accustomed to?

On Google, how often do you use faceted search, or go beyond the options that you can access via the advanced search UI?

If faceted search went away, or was replaced with facets generated from metadata that could be derived from the core dataset, what would be lost? Anything?
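
To make "facets generated from metadata that could be derived from the core dataset" concrete, here is a rough sketch. It assumes each document is a dictionary with `author` and `created` keys (hypothetical field names, with `created` as an ISO date string) and simply counts values, which is all a basic facet sidebar needs.

```python
from collections import Counter


def derived_facets(documents):
    """Build facet counts from fields the system already records."""
    facets = {"author": Counter(), "year": Counter()}
    for doc in documents:
        facets["author"][doc["author"]] += 1
        # The year is the first four characters of an ISO date string.
        facets["year"][doc["created"][:4]] += 1
    return facets
```

No one tagged anything to produce these facets, so there is nothing to apply inconsistently.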

Look at your search habits. Identify if or when faceted search saved you time. In situations when you use faceted search, was faceted search essential, or could it have been replicated by full text search?

Search Has Its Limitations

But with all that said, search has its limitations.

Understanding how stemming works (or doesn't work) is essential to interpreting the results we get.
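
A toy example shows both failure modes. The function below is a deliberately naive suffix stripper written for illustration; it is not the algorithm used by any real search engine, and production stemmers are considerably more careful.

```python
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

With these rules, "licenses", "licensed", and "licensing" all reduce to "licens", but "license" stays "license", so a search for the singular misses the plural. In the other direction, "news" reduces to "new", so a search for one returns results for the other.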

And this is more complex when we work with translated content in multiple languages.

"Just In Time" Metadata

There are times and places where good, structured metadata is essential. By separating out the metadata requirements from the actual dataset (and keeping the core data as simple as possible) you help ensure that the quality of your underlying data remains high.

Implementing a metadata structure around data is firmly in the domain of a context-specific application.

In terms of open educational resources, this allows for easier reuse of the data. If a piece of content was written in the US, a school looking to reuse that content in the UK won't care about the Common Core alignment of the resource.

To put this another way, inflicting a metadata standard on your data (as opposed to applying metadata within an application that uses the data) makes your data both less portable and less useful.
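
The difference between inflicting and applying can be sketched in a few lines. Everything here is hypothetical: the content id, the overlay structure, and the placeholder standard code are invented for illustration. The core record carries no alignment data; each application layers its own metadata on top when it builds a view.

```python
# Core dataset: content only, no application-specific metadata.
CORE = {
    "fractions-101": {"title": "Introducing Fractions", "body": "..."},
}

# Overlay owned by a hypothetical US application. The standard code is a
# placeholder, not a real Common Core identifier.
US_ALIGNMENT = {
    "fractions-101": {"standard": "EXAMPLE-STANDARD-1"},
}


def view(content_id, overlay=None):
    """Return a copy of the core record, plus overlay metadata if given."""
    record = dict(CORE[content_id])
    if overlay is not None:
        record["metadata"] = overlay.get(content_id, {})
    return record
```

A UK application calls `view("fractions-101")` and never sees the alignment; the US application calls `view("fractions-101", US_ALIGNMENT)` and gets it. The underlying data is identical in both cases.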

Portability

In listening to people who are writing and using open content, a key barrier we hear about repeatedly is portability (there are others as well, and these other issues will get their own posts).

A barrier to portability - and really, to the usability of authoring and translation platforms that support open content - is the premature and often unnecessary application of metadata into the underlying data. If we keep the data as clean as possible - which means resisting the urge to apply metadata without a compelling need - we can simplify both portability and usability. Metadata should be applied as part of an application that uses the data, when there is a clearly defined need to categorize the data. And then, the categorization should be done by people who know what they are doing.

It doesn't matter how good your categorization system is if it is applied to your data inconsistently, and/or if no one uses your data.

Image Credit: "faceted" taken by jenny downing, published under an Attribution license.

Common Misconceptions Around Common Core

There's no getting around it. The Common Core standards bring out the crazy.

Benjamin Reilly does a good job of collecting the crazy in one place, but his "alert" highlights a real issue: the amount of disinformation about Common Core has the potential to derail any rational discussion about the standards.

So, for those following along at home, here is a high level breakdown of the elements of this discussion. At the outset, I want to stress that this is a summary, and that there are certainly things I am missing and/or getting wrong. Please, point out these myriad shortcomings in the comments.

The best place to start is with the Common Core standards - these are learning standards, plain and simple. There are things to like and dislike about them in their own right, but the standards are just that: standards. My preferred starting point for analysis of the standards and their implications is Tom Hoffman.

Of course, new standards require new curriculum aligned to those standards. Thank goodness, some of the people that participated in writing the standards are ready with products to sell that make sure districts meet those standards.

The Federal Race to the Top program (and it's worth noting that there are different strands of Race to the Top) emphasized adoption of Common Core standards and the implementation of student data systems. In Race to the Top, when you see language around "college and career ready standards" that is generally a stand in for Common Core. Whenever you see language around personalized learning, bringing data-driven decisions to the classroom, and/or identifying teachers with a track record of success, the means to achieve these goals are generally understood to include a comprehensive data system.

A representative sample of what this language looks like in the Race to the Top documentation is included below:

Under Proposed Priority 1, applicants must design a personalized learning environment that uses collaborative, data-based strategies and 21st century tools such as online learning platforms, computers, mobile devices, and learning algorithms, to deliver instruction and supports tailored to the needs and goals of each student

The federal data standard is at CEDS; inBloom is implementing the CEDS standard in its datastore.

When the Obama administration allowed states to get waivers for NCLB, the conditions for getting waivers reinforced some of the incentives in Race to the Top, including Common Core adoption and using student test scores as part of teacher evaluations - which, in turn, reinforced the need for a comprehensive data system.

Another facet related to - but separate from - Common Core is the set of new tests that accompany Common Core adoption. These tests have been referred to as the Next Generation of Assessments, and have been discussed in many places; this speech from Secretary Duncan in 2010 provides a good introduction to the concept. A recent flare-up over some of the new tests - in this case, written by Pearson - sparked an Opt-out movement in New York. Gotham Schools looks at some of the good things in the new tests.

So, a short version - we have:

  • Common Core standards;
  • New curriculum, aligned to the Common Core standards;
  • New standardized tests, aligned to the Common Core;
  • Centralized data systems to collect information on students and teachers;
  • Race to the Top, which gave money to states and districts that prioritized implementing the above components;
  • Waivers for NCLB, which reinforce some of the incentives for Race to the Top.

And, of course, this is happening against a political and social backdrop that includes heated debates about the worth of teachers unions, intense and well funded efforts to privatize public education, the aggressive expansion of both for-profit and non-profit charters, cheating scandals, a narrative about how our school system is failing, and an increased reliance on standardized tests as a measure for both teacher effectiveness and school success. All of these elements are related - but ultimately distinct - strands in the conversation.

This web of related-but-separate elements makes it simultaneously honest but disingenuous when advocates for Common Core say things like, "The new standards don't mandate what teachers teach." This is honest because the standards, with some glaring exceptions, attempt to stay out of implementation. It's a disingenuous statement, though, because the implementation of Common Core is embedded in these other elements that do place constraints on educators.

But, when you have Glenn Beck and Michelle Malkin adding their misshapen four cents to the conversation, one thing is nearly certain: the progressive left will support whatever they argue against. This is a lost opportunity, because the educational system in the entire United States would benefit from a clear discussion of Common Core. The present direction of the conversation makes that increasingly unlikely.

Open Educational Resources, Professional Development, and Public Money

Yesterday, Darren Draper put out a post expressing some concerns with Teachers Pay Teachers. Shortly after putting out that post, Darren was forced to don his flame-and-troll-proof suit, as the comment thread got, well, interesting.

I'll get to the discussions in the comment thread later in this post, as a majority of the comments are illustrative of a small part of a larger problem.

OpenWashing, Teachers Pay Teachers Edition

Teachers Pay Teachers markets itself as "An open marketplace for educators where teachers buy, sell and share original teaching resources." In this context, Teachers Pay Teachers (or, TpT) provides a clear example of how the word "open" has been mangled beyond recognition.

Money

For those of us working in open source and open content, our notions of openness generally share some common pedigree with the four freedoms of free software, the definition of Open Source, and the Creative Commons licenses. It's worth noting that, even within these broad definitions, there is often vehement disagreement as to what constitutes open. However, even while acknowledging that there is no universally accepted definition of what "open" really is, it's still safe to say that TpT isn't it.

"Open" does not equal "being on the internet."

TpT is a marketplace, and this is fine, but a marketplace that anyone can enter isn't an "open" space, at least not in the context of Open Educational Resources. TpT puts technically unneeded barriers in the way of reusing content; the most obvious of these barriers is the need for a login even to download a free resource. The business need of TpT (collect contact info) is in direct conflict with greater openness, and TpT lets the business need trump the tendency to be more open.

And, of course, this is fine - it's just not open. If, however, your actual practice conflicts with your marketing catchphrase, that's not good.

I'll return to TpT at the end of this post, but now, we're going to jump into Darren's post.

Private Time, Public Money

Darren lays out seven reasons why he struggles with TpT. I'm highlighting 5, 6, and 7, below:

5. Public school teachers are paid by the taxpayers - with public funds - to work during specific hours of the day.
6. The computer and other equipment used by public school teachers were all likely purchased by the taxpayers, using public funds.
7. It is my belief that classroom activities, assessments, games, handouts, outlines, posters, printables, research, worksheets, and the like - that have been created by a public educator during work time or with school-owned equipment - belong to the public and should therefore be licensed with an appropriate, open license. Resources created with public funds should neither be bought nor sold by teachers because they were never the teacher's to sell in the first place. Because these resources were created with public funds, they belong to the public.

I checked the comment thread on Darren's post before starting to write this response. When I looked, there were 34 comments - 19 of those comments focused on when content was created - and this is illustrative of the larger problem.

The question of who owns teacher-created content - and the nuances of the time of day and equipment used to create the content - came up in several of the Open Content Authoring events we ran over the last several months.

Our advice on this question in the short term:

  • Work on your curricular material outside of school hours, and use your personal account. Store a copy on personal hardware (an external hard drive, a personal blog, a personal Google Apps account, etc);
  • Let your district know that their policy on intellectual property creates an unnecessarily adversarial relationship around curriculum planning;
  • Let your district know that their policy on intellectual property creates a disincentive to you doing your best work, as the only way you can maintain ownership over your work is to do it outside "normal" working hours on your own equipment;
  • If you belong to a union, bring this to union leaders as an issue that needs to be on the table as part of contract negotiations;
  • Incorporate a piece of Creative Commons Licensed content into EVERYTHING you do for your work - make sure it is licensed under the Share-Alike clause. This means that your District can claim ownership of it, but that due to the nature of the license, you (and anyone else) are free to reuse it under the terms of the CC license.

In Darren's comment thread, the fact that so many commenters were fixated on the timing issue flags the reality that people are having a hard time seeing the forest for the trees. Fighting about the time of day when you are allowed to maintain control of your creative output means that you are living in the box that people laid out for you. Fighting about the time of day when you can do your work means that your perspective is limited at the outset. This comment illustrates the predicament perfectly:

I was so with you on this post, until it hinted that items were most likely being created on school time and/or with school equipment. I would encourage you to spend a week with me to see that I don't have enough hours in my "school day" prep time to make my weekly schedule, copy/assemble resources, grade papers, record grades, communicate with parents, and supervise my students during additional remediation opportunities. I consider myself lucky to sneak in a second bathroom break each day! :)
All of my TpT products are made by me, at home, on my personal equipment with software I've purchased myself (my classroom computer is a desktop that is over 8 years old) .That's *after* I have spent numerous additional hours per week grading papers, inputting grades, and emailing parents (from home, on my own computer). My dear husband can attest to the hours he has spent helping me cut, laminate, recut, and assemble centers for my kiddos.

The workload issues here sound very typical of most teachers that I know. There is not enough time in the workday to cover their professional responsibilities, so work comes home. Work spills into weekends. Budgets for supplies have been slashed, so teachers buy supplies out of their own pocket. School equipment is outdated or locked down to the point of unusable, requiring much prep to take place outside of school networks, on non-school machines. Teaching doesn't fit into the hours defined in most contracts, and teachers put in significant time outside of traditional working hours, in addition to spending their own money on class supplies.

And this is the conversation we should be having: why are teachers expected to power the underfunded mandates of increased reporting in the era of high stakes testing, with fewer resources, less support, in a work day that doesn't have room for all the demands on teacher time? Districts that have policies that claim ownership of teacher intellectual property are perpetuating that absurdity, and this absurdity needs to be addressed and clarified in employment contracts. Unions need to make this an issue as well.

Lessons Are Not The Ultimate Goal

The problem - and a shortcoming - of both traditional textbooks and content silos like TpT is that they treat a lesson as the stopping point. This makes sense for them, because both textbook companies and TpT make money from distribution. If there is no sale, there is no revenue. From a business standpoint, this makes perfect sense.

Creating and using open content approaches the same problem - how do I get the best possible material to my class - from a different place. Teachers can use open content exactly as they would use a textbook, or a piece of content purchased from TpT; for many people, that is where their understanding of open content ends. However, that vision of open content is incomplete, and rooted in our habits of using material with restrictive licensing.

There are different levels of using open content; teaching lessons that use open content is the starting point. Remixing material that incorporates two or more openly licensed sources is a next step. Releasing that remixed version is the next step. Collaborating with other people to edit and remix content is an additional level of involvement.

And, if you look at the trajectory of using open content, it resembles the trajectory of learning. It's not a transaction (go here, buy this) - it's a series of interactions of increasing complexity, each of which requires judgment and expertise. Over time, building and using open content develops a professional network and a collection of domain level experts to work with. Working with people to create open content is some of the best ongoing professional development out there, and districts would be wise to embrace and support this reality. Rather than make absurd claims over ownership of teacher IP, they could divert some professional development money into supporting teacher time in a facilitated authoring process that spanned the course of a year. The resulting material could be released under a Creative Commons license, ensuring that teachers and the district were given the appropriate credit for their role in creating and funding the work, and material created with public money would remain available for public use.

Image Credit: "Money, get away!" taken by kiki follettosa, published under an Attribution Non-Commercial Share-Alike license.

Barriers and Contradictions

Last weekend, we ran another open content authoring session at Lewis Elementary in Portland, OR; we'll have more details on the event in a post later this week. During this session, we talked with several educators about ways to work around the organizational barriers they face. I'm going to list out a couple here; frequently, when we talk about the things that are absent from the school learning environment, the conversation stops at blockages of YouTube and other social media sites. Really, though, there are barriers that are far more basic and pervasive than that.

Contradiction

Students Can't Save HTML Files

We spoke with an educator working within PPS who had set up a lesson where students were learning about the web, including some basic HTML and CSS. The lesson went fine until it came time for students to save their work; they were blocked from saving HTML files.

SSH is blocked

We have worked in schools, and worked with teachers in schools, where SSH is blocked. For anyone working in web development, SSH is a central tool for doing our work. Blocking SSH is akin to teaching carpentry without hammers and saws.

Districts Claim Ownership Over Teacher Intellectual Property

The way some district contracts are written, districts claim ownership over any work that is done during school hours, over a school network, or on a school-provided machine. So, if a teacher does planning during the school day, even if she is creating something entirely new that is her creation, the district position is that they - the district - own that work.

Why Should We Care?

In the current political climate of educational reform, teachers are under a tremendous amount of pressure. Teachers and schools have a lot of rhetoric directed at them about how they need to embrace "21st century learning" and teach web literacies and develop knowledge workers, all while meeting the more time-consuming reporting requirements created by the unfunded mandates of NCLB, and having their performance measured by standardized tests that often don't examine what learning looks like.

And in the face of all this, there are district-level policies that directly interfere with a teacher's ability to work. When a district claims ownership over creative work done during the work day, the district creates an enormous disincentive to work with peers during school time, as any result of the collaboration would be owned by the district and not the creators. This flies directly in the face of what networked, connected teaching should be, as it is predicated on sharing our work with others. Fortunately, as we discussed in our authoring event, incorporating openly licensed materials into our work makes district claims of ownership a moot point, as the district can still claim ownership but the license allows for free and universal reuse.

What is incredibly heartening is talking with teachers, and hearing the creativity, thought, and caring that they put into their work. There are some amazing educators working to help our kids learn, and it's great to see.

What is disheartening is to see the artificial, policy-driven barriers put in their way. Here in Oregon, we are hearing a lot of talk about improving our educational system. And some of these things actually sound okay. And please don't get me wrong: high level change is part of the solution too. But we also need to remove the unnecessary barriers to teachers doing their best work. The notion that a district owns a teacher's work needs to be addressed legislatively, and through contracts. If a district thinks that they are going to get rich from owning and selling content, they should go talk to their local newspaper - the one that went out of business three years ago.

Image Credit: "Contradiction" taken by sweetenough, published under an Attribution Share-Alike license.

Twibbon Provides A Great Example Of An Awful Privacy Policy

Twibbon is a service that markets itself as a tool to support "your cause, brand or organisation in a variety of ways." Twibbon targets Facebook and Twitter, and provides a small graphic that gets added onto a profile picture. This graphic is a visual way to show support for a cause.

After reading through Twibbon's privacy policy I have one question for organizations that use Twibbon: why do you require that your supporters surrender all privacy?

The Twibbon privacy policy is remarkably honest when it describes what it will collect, as it clearly states that it will get your contact information, your location, and other details related to surveys and "offers" (aka, ads and marketing).

RIP Privacy

What we may collect

We may collect the following information:

  • Contact information including email address.
  • Demographic information such as postcode, preferences and interests.
  • Other information relevant to customer surveys and/or offers.

Additionally, Twibbon clearly states that it will track the web pages you visit, what you search for, and any interactions with ads. Twibbon clearly tells you that once you sign up for their service, they will track a significant portion of what you do on the web, and save that data.

When you visit the Site, our servers automatically record information that your browser sends whenever you visit a website. This data may include information such as your IP address, browser type or the domain from which you are visiting, the web-pages you visit, the search terms you use, the location of your ISP and any advertisements on which you click. For most users accessing the Internet from an Internet service provider the IP address will be different every time you log on. We use this data to monitor the use of the Site and of our Service, to gather information about the location of our users and for the Site’s technical administration. We do not associate your IP address with any other personally identifiable information to identify you personally, except in case of violation of the Terms of Service.

Because Twibbon targets both Facebook and Twitter, it has the ability to combine data from both sources - and from Facebook, this likely includes information about all of your friends as well. This gives Twibbon a dataset that combines Facebook info, Twitter info, and info about your behavior wherever you go on the web.

This level of information allows you to be identified and tracked with remarkable accuracy. For example, even with fully anonymized location data, with an adequately large data set, researchers can pinpoint an individual based on just 4 locations. Location data is readily available within the data streams from both Twitter and Facebook.
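
The intuition behind this is easy to show with a toy example. The traces below are invented, and `matching_users` is a hypothetical helper, not the researchers' method; the sketch only illustrates how each additional known point shrinks the set of people it could belong to, even when the user ids themselves are anonymous.

```python
# Invented traces: anonymous user id -> set of (place, hour) observations.
TRACES = {
    "user-a": {("cafe", 8), ("school", 9), ("gym", 18), ("home", 22)},
    "user-b": {("cafe", 8), ("school", 9), ("library", 17), ("home", 22)},
    "user-c": {("cafe", 8), ("office", 9), ("gym", 18), ("home", 22)},
}


def matching_users(traces, known_points):
    """Return the users whose trace contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in traces.items() if known <= points)
```

Knowing only the 8 a.m. cafe visit matches all three users. Adding the 9 a.m. school visit narrows it to two, and adding the 6 p.m. gym visit leaves one. Real datasets are far larger, but the narrowing works the same way.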

Additionally, Facebook "likes" are also remarkably accurate at predicting a variety of factors, including religion, substance abuse history, sexual orientation, political stance, and introversion/extroversion. Along the same lines, anonymized search data can reveal many details about the searcher.

But of course, Twibbon doesn't need to worry about the constraint of anonymized data, as they have your exact identity within Twitter, Facebook, and the Web. When you support an organization using Twibbon, you are agreeing to have the portion of your life that you put online recorded, analyzed, and sold. Because math and statistics work, that allows people who buy access to your data to make very accurate predictions and inferences about the portions of your life that you don't put online - you know, the part of your life that we reasonably assume to be private.

Image Credit: Image found and reused from Eatwifme's Tumblr
