Knight Drupal Initiative
Building Features for Install Profiles
Posted August 18th, 2010 by Bill
When we originally set out to build a set of Features to support the install profile of our Knight Drupal Initiative work, we figured we would use Context to drive the creation of these features.
And this almost worked. Almost.
The Challenge
The challenge we faced in working with features involved sorting out the way Features manages dependencies. When you're setting up a feature, and especially when you're setting up a feature based on a context, the dependencies are automatically brought in and generated for you. So, most of the things that the context relies upon are automatically made a part of your feature (and Strongarm can generally grab the rest). This is great if every single one of your contexts is a completely standalone item. But, if anything in your context contains (for example) a UI element that connects over and exposes functionality that is contained within another context, then you will have two features with overlapping dependencies, and this creates conflicts.
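To make the overlap problem concrete, here is a minimal sketch of the check we ended up doing by eye: given the components each exported feature claims, find the ones claimed by more than one feature. The feature names and component identifiers are hypothetical and chosen only for illustration; this is not part of the Features module itself.

```python
from itertools import combinations

# Hypothetical feature names and component identifiers, for illustration only.
features = {
    "kdi_blog": {"node:blog", "views:blog_listing", "context:blog_section"},
    "kdi_gallery": {"node:image", "views:gallery", "views:blog_listing"},
    "kdi_events": {"node:event", "views:calendar"},
}


def find_overlaps(features):
    """Return {(feature_a, feature_b): shared components} for each conflicting pair."""
    overlaps = {}
    # Sort by feature name so the pair keys come out in a predictable order.
    for (name_a, comps_a), (name_b, comps_b) in combinations(sorted(features.items()), 2):
        shared = comps_a & comps_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


overlaps = find_overlaps(features)
for (name_a, name_b), shared in overlaps.items():
    print(f"{name_a} and {name_b} both export: {', '.join(sorted(shared))}")
```

Any pair reported here is a candidate for moving the shared component into a single feature that the others depend on.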
The Goals
At the outset, we had four main goals in setting up our install profile, and the related features used within the profile.
- Create a site that someone could install and start using;
- For site maintainers/non-developer admins, make features easy and intuitive to use by creating a logical set of dependencies; aka, features should contain modular sets of functionality, and be as small and as lightweight as possible;
- Retain the ability to use features to track changes in site config over time. One of the huge benefits of Features is the ability to track changes that you make in config; using the Diff module brings more of the awesomesauce, and here at FunnyMonkey we are serious about the awesomesauce;
- Make features that are as reusable as possible.
Some Other Things That Didn't Work
Briefly, we contemplated shipping an install profile with one enormous feature. Technically, this works, and the install would have been simple, but the maintainability of this arrangement could potentially get complicated over time. Additionally, one feature that holds everything is not particularly reusable, and it doesn't reveal a clean site architecture through a set of clearly defined and managed dependencies. So, while this would have worked, and is a viable approach in some use cases, it didn't align with our goals.
The next option we considered was to use Features to define functionality, and document how to use the Context module to control block visibility and create a coherent UI to connect the various corners of the site. This would have achieved our goal of making reusable features, but it would have required non-technical users to interact with the Context UI in order to get the most from the site. Given that one of the goals of the entire Knight Drupal Initiative is lowering the barrier to entry for newer or novice users, this approach didn't seem viable either.
Hey! You Put Your Feature In My Module!
One of the many cool things about features is that it pushes config to code; to put it another way, it creates a module from config options. Like modules, features can declare dependencies. So, with this in mind, we exported some initial features that contained the overlapping dependencies as defined above. Then, we edited the exported features so that they contained just the components we wanted them to have - this process included removing the overlaps, and maintaining dependencies on both modules and other features. This allowed us to create some base features that contain the central keys to delivering the functionality - things like content types, imagecache settings, fields, etc. Then, we created some extras features that contain - for example - various views, flags, and other mechanisms used to organize information on the site. Finally, we created UI features that contain the contexts and the reactions that display specific blocks on specific pages.
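The layering described above (base features, then extras, then UI) amounts to a dependency graph, and a rough way to sanity-check it is to compute an order in which everything could be enabled. The sketch below assumes hypothetical feature names (kdi_base, kdi_extras, kdi_ui) and a simplified list of module dependencies; it uses Python's standard graphlib module (Python 3.9 or later) rather than anything from Drupal.

```python
from graphlib import TopologicalSorter

# Each key depends on the modules or features in its set.
# All names are hypothetical and for illustration only.
dependencies = {
    "kdi_base": {"cck", "imagecache"},              # content types, fields, presets
    "kdi_extras": {"kdi_base", "views", "flag"},    # views, flags
    "kdi_ui": {"kdi_extras", "context"},            # contexts and block reactions
}


def enable_order(dependencies):
    """Return an order in which every item comes after its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())


order = enable_order(dependencies)
print(" -> ".join(order))
```

If two features ever end up depending on each other, the sort fails with a cycle error, which is a useful early warning that a component belongs in a lower layer.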
As we worked on the initial site build - and the subsequent revisions of that build prior to building out features - we also paid careful attention to tags, and to maintaining some consistency with how we tagged and organized our views and contexts. This meta-organization of the building blocks within the site helped us as we got down to organizing the build into features and an install profile.
To get a sense of how this comes together, look at the screenshot below.
This page is pulled together via multiple different features, and they are tied together with a final UI feature.
Using this approach, we are able to meet all of our goals as defined above. Additionally, by separating the functionality from the UI that exposes that functionality, we provide more flexibility for people to use whatever means they want to display content. On our site, we use Context to control block visibility and other display settings. However, someone else could just as easily leave our UI features turned off and use Panels, and/or Drupal's core block visibility settings.
For another example, the home page of the site ships with a slide show, and content displayed within vertical tabs (the base theme we are using for this site is Hexagon; there is some real loveliness in there, but that's a topic for another post). This homepage is generated via a UI feature; if someone wants this UI, they can turn it on. Or, people can build out swappable home page UI features that build on the underlying components, and manage these changes via the features UI.
The down side of this approach, of course, is that you can't use the Features UI to build your feature, and the process of defining dependencies manually requires some quality time with Strongarm and the variables table.
Ideally, we could control or manually override dependencies using the Features UI, but patching Features to provide manual overrides of dependencies is no small task, and would likely come at the expense of usability.
By treating features like the modules they are, and by setting up dependencies between them, we can create small, reusable building blocks that retain the maintainability of features created via the standard Features UI. We are getting ready to release our install profile that incorporates this method of building and maintaining features; we still have some testing to do to make sure that we haven't missed or overlooked anything. And with that said, there are likely other ways of solving this, and we would love to hear about them.
Mailhandler and MIME Router
Posted June 22nd, 2010 by Bill
The combination of Mailhandler and MIME Router allows you to set up your site to take posts via email, and to route attachments into filefields.
MIME Router integrates cleanly with Filefield, Imagefield and SWF Tools. File paths can be set via the Token module. To use MIME Router, just upload it into your modules directory and enable it; the module uses the allowed filetypes in Filefield's user interface. Aside from adding fields to your content types, no additional config is needed to use MIME Router.
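As a rough illustration of the routing idea (not the module's actual code, which is written in PHP), the sketch below sends each attachment to the first field whose allowed file types include it. The field names and extension lists are hypothetical, and matching on file extension is a simplification made for this example.

```python
import os

# Hypothetical field names and allowed extensions, standing in for the
# per-field settings an admin would configure. For illustration only.
field_extensions = {
    "field_image": {"jpg", "jpeg", "png", "gif"},
    "field_audio": {"mp3", "wav"},
    "field_video": {"flv", "mp4", "mov"},
}


def route_attachments(filenames, field_extensions):
    """Assign each filename to the first field that allows its extension."""
    routed = {field: [] for field in field_extensions}
    unrouted = []
    for filename in filenames:
        ext = os.path.splitext(filename)[1].lstrip(".").lower()
        for field, allowed in field_extensions.items():
            if ext in allowed:
                routed[field].append(filename)
                break
        else:
            # No field accepts this file type.
            unrouted.append(filename)
    return routed, unrouted


routed, unrouted = route_attachments(
    ["photo.JPG", "interview.mp3", "notes.txt"], field_extensions
)
print(routed)
print("Not routed:", unrouted)
```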
Using this setup, images, video, and audio can be sent to your site via email. This also supports posting to the site from handheld devices that support emailing files. So, if you have ever been in a situation where you used a handheld device to take a picture, video, or audio recording, and wanted to post it to your own (or your organization's) web presence, this functionality is for you. This functionality can also be used to support eyewitness information on local news sites, real-time reporting for school newspapers, and other situations where you want to get fresh information live quickly.
The MIME Router module was written and released as part of our work for the Knight Drupal Initiative.
Notes
Mailhandler settings
- Defaults: admin/settings/mailhandler
- Adding mailboxes: admin/content/mailhandler
MIME Router config
- Nothing!: it uses CCK/Filefield admin settings. Nothing to see here.
Other modules used in the screencast
- CCK
- Filefield
- Lightbox2
- Imagefield
- Imagecache and Image API
- Token (used to create dynamic paths for uploaded files)
- SWF Tools
- Mimedetect
Dreaming
Posted June 11th, 2010 by Bill
Here is my dream:
A dozen schools join forces to share curriculum created by teachers over the span of an academic year. The curriculum could range from individual lessons to more structured units. They would publish these lessons under an open license (ideally, the Creative Commons Attribution-Noncommercial-Share Alike license). Teachers from these dozen schools would publish these lessons on a blogging platform that allowed people to subscribe via RSS, and they would tag them according to subject and grade level.
Over the course of a year, these lessons could be aggregated into a central location. Then, over the summer, they could be organized into more structured collections that could begin to resemble textbooks. These textbooks would have soft spots and missing sections. These missing sections could then be targeted via specific outreach, and during the second year, lessons could be collected that filled these gaps. These lessons could then be aggregated into the main lesson repository, and mixed into the existing texts as needed.
Every year, for every class, teachers create original curriculum. Teachers are already doing this work. The missing piece, of course, is the sharing, which happens less frequently.
I think about this when I read about school districts selling publicly-funded curriculum to Pearson. I understand that school districts feel pinched, underfunded, and under pressure to find new revenue sources. However, the shortsightedness of selling this content to Pearson does nothing to provide a sustainable future for the district. If those texts had been released as an open resource, it would have saved countless districts an enormous amount of money.
Over the summer, we will be releasing our code for our Publishing Platform and Aggregation Hub. These tools will be freely available to all to download and use. This work has many applications; one of the ways it can be used is to aggregate, organize, and republish open content. I look forward to the day when teachers are again viewed as content experts, and purchasing a textbook from a company is viewed as an unnecessary, inefficient use of resources.
An Early Look At Managing News
Posted October 20th, 2009 by Bill
Over the last weekend, we had the opportunity to install and test Managing News, a Knight Foundation funded project built by Development Seed. Managing News will be released later this week; we were fortunate enough to get an early preview.
Brief Overview
Managing News is an install profile built on Drupal. All of the components used in Managing News are available for free under open source licenses.
In short: it is free to obtain, and it installs like any other Drupal site. For the visual and auditory learners, this video -- produced by Development Seed -- provides an overview of the site.
Managing News contains three sections: Feeds, Search, and Channels.

Each of these sections is covered in more detail below.
Digging In: Adding Feeds
At a first glance, some people might confuse Managing News for a feed reader. This misconception is understandable, as the first step in using Managing News involves adding feeds to bring in content. This process is straightforward: click the "Add Feed" link, as shown below.

Once feeds have been added, information from these feeds will be imported into the site. As data begins flowing through the site, more of the features of Managing News can be used. Every section of the site (Feeds, Search, and Channels) contains consistent display options:

- Title, with summary, organized chronologically;
- List format, 1-2 line summary, organized chronologically;
- Visual representation on a map.
These options allow people to navigate the news in the way that best suits their need. Additionally, as people read through stories, they can share posts that they see by clicking the "Share" icon, as shown below.
When content is shared, the path to the original article is included as the link, and the url is automatically shortened. The included link points back to the original place where the article was posted, so the original source receives credit -- and the resulting web traffic -- for their post.
The site streamlines the process of sharing content to Twitter, Facebook, and via email.
Two Notes Before Moving On
Two additional notes before we move on:
Note 1: this post covers one way of bringing data into the site -- via RSS feeds. However, content can be brought into the site in a range of other ways, including via CSV import. Some of these additional options are covered later in this post.
Note 2: mapping works out of the box. As posts come into the site, geographic information is automatically extracted. This allows posts to be displayed against a map to highlight relevance to a specific region. To emphasize: this mapping functionality just works, with no additional configuration required. Moreover, it has been designed to be customized and extended as needed, but more on that later.
Search
Once feeds have been brought into the site, information can be sorted and discovered via text-based search. Searches can be saved; this way, if there is a specific type of information that needs to be highlighted or discovered from the incoming information, the saved search can help make this happen automatically.

Saved searches also generate RSS feeds, so people can subscribe to these results.
Channels
Channels provide an additional way to vet, display, and redistribute content. Channels can be created by site members; once a channel has been created, people can tag individual articles to be published in a channel. Like saved searches, channels generate an RSS feed, so people can subscribe to a channel.
Channels are created from the Channels page, or when viewing the search results.
To add a post into a channel, select the active channel, and then click the icon next to the post.
Taking a Step Back
The natural flow of information within the site -- from all feeds, to saved searches within these feeds, to channels that group and recontextualize individual items according to an arbitrary theme -- helps illustrate how the site can be used in different ways by different people within the same organization. The Feeds page functions much like the home page of a newspaper, magazine, or blog: it shows you all of the latest news. For people who are more focused in what they are looking for, the Search page allows them to carve through the content by searches. Finally, for people looking to browse through content on limited time, the Channels provide information that has been vetted/singled out as having greater value.
And this is where the real value of Managing News begins to become clearer: with most products, you can break down the value of the product in an answer to one simple question: what does it do? With Managing News, the breakdown is not as simple, as it does different things for different people at different times. Moreover, these general categories (feeds, searches, and channels) can be used for different things; as one of many possible examples, a local paper could use a channel within Managing News as a tipline; people could email tips into the paper, they could be imported into Managing News via Mailhandler, and the more promising leads could be highlighted in the Tipline channel. A similar process could be used to sort through hashtag-based coverage of breaking stories via Twitter or other social media channels: posts with the hashtag could be imported, and then a selected number of these posts can be republished in a channel -- and, as we discussed earlier, the channel would have its own RSS feed, making the channel a cleaner version with a better signal to noise ratio than the original disparate sources.
A Product and a Platform
In its current form, Managing News provides powerful functionality. The standalone product will allow many organizations to extend their online presence with little to no additional expense. This is a tool that levels the playing field by giving smaller organizations access to tools previously reserved for bigger, richer organizations -- however, it will likely be adopted and extended by various types of organizations because it is both easy to install and easy to extend.
It's very easy to see how an application like EveryBlock could be built on top of Managing News -- with the caveat that Managing News could be developed to simultaneously support the hyperlocal, the regional, and the national. Using building permits as an example of just one of the nearly countless potential data points, a site like Managing News could collect building permit info for any city that made that info publicly available in a readable format. Then, that information could be displayed on a block by block basis within a city (like EveryBlock currently does) or it could be used as the basis for comparing building activity across regions, across time, or against other data points that have been imported into the site. The geotagging would need to be modified from the default configuration, but the system has been built to support these types of customizations.
Managing News could also be used to give organizations an internal version of something like Publish2. Where it really starts to get fun, though, is that Managing News-based services wouldn't need to compete with an application like Publish2, they could actually work alongside it in a mutually supportive way. An organization could have their internal system based on Managing News, and then create a publicly accessible channel that would connect up to Publish2 by extending the "Share" feature described earlier in this post.
For our part, Managing News provides us some great opportunities for our own Knight-funded work. The aggregation-collection-republishing workflows can be leveraged as part of our platform, and the fact that Managing News exists allows us to focus in on other aspects of development, such as harvesting data from handheld devices. This collaboration highlights another advantage to developing these tools within an open source ecosystem: in the process of doing our work, we will contribute both code and documentation back into Managing News. The existence of Managing News will improve the quality of our work, and in turn, our work will filter back into Managing News.
Conclusions
Managing News gives organizations a powerful, flexible tool to use as they work online. The functionality of the site is well defined, and cleanly focused. Moreover, the design of the site keeps things looking simple, when there is some fairly complex data management occurring. The official release will be announced later this week; watch the Development Seed blog for the announcement.
On the Road to the Future of News and Civic Media Conference
Posted June 16th, 2009 by Bill
I'm getting ready for the Future of News and Civic Media Conference, and as part of the preparation we have been putting together a research/development site as part of our work for our KDI project. We are still evaluating the different options that will make it into the initial versions of our platforms.
For this stage of the research, we chose to focus in on the events of the Iranian election -- first, I was woefully underinformed about the events of this election, and given the noise (or, some would say the lack of it) about the event, this seemed like a good opportunity to set up a tool that would provide an overview of the event, with a cross-section of primary source material (largely from YouTube and Twitter) and content from more polished sources (both blogs and traditional/mainstream media). In a few ways, this provides a real use case: if an organization doing grassroots organizing wants to find out about and publicize events occurring in several places on the same day, this type of aggregation from multiple sources allows real-time data collection from disparate sources. People can continue to use the tools they already use to discuss their work, and the main site can collect information from these sources and present/recontextualize it from a central location.
With minimal effort, we were able to put together a rough set of tools that allows people to get a perspective on the events going on in Iran. We built the site using tools freely available within the Drupal community. The bulk of the heavy lifting is done using FeedAPI and friends, and the folks over at Development Seed deserve huge kudos for unleashing these tools into the world. We also used the Views module to split out the content of the feeds. Obviously, the resulting site is a proof of concept, more of a pre-alpha prototype than anything else, but the site is useful as a research tool. We'll also continue developing on the site after publishing this post, so the site will undergo changes over the next few days as we modify things/tinker.
On our testing site, the Twitter traffic provides a pretty scattered overview, but taken in aggregate it allows one to scan raw data over time and get a sense of the ebb and flow of events on the ground. Initially, this feed was pretty free from spammers, but lately some opportunists have taken to using the #iranelection hashtag to get a broader audience for their content.
The YouTube videos provide another means of getting a sense of what is happening. As with any piece of media, the source of the content and bias of the author need to be taken into account. Also, our means of collecting these videos is certain to miss some content, as we are just aggregating the feed for the search term "Iran election protest".
The content coming from MSM outlets and blogs is admittedly arbitrary -- we tended to favor sources that had a clean tag for either Iran or the Middle East; the dearth of effective tagging on information coming from both traditional news outlets and some group blogs is discussed in more detail later in this post.
The important piece of this from our perspective: this tool can be easily built and focused on just about any topic. When it is set in motion, the site will gather and import information that can be used, analysed, recontextualized, or otherwise modified. This can be a tool used for any topic discussed on the web:
- grassroots organizing/community media -- use various data streams to collect information in real time that can be analyzed/collected/synthesized over time
- lesson creation -- aggregate writing about a specific topic, then choose the imported resources that align with your learning goals. Edit these assets as needed, or add in information that is missing
- farmer's markets -- farmers/sellers/market organizers use a microblogging platform to describe what they will be selling, and where; this information can be aggregated and geotagged, allowing an accurate breakdown of what is for sale at local markets.
As we built this out, we encountered some surprises. A short list includes:
The Wall Street Journal uses FeedBurner for their RSS tracking. However, this is exposed in their feeds (or at least in their World News RSS feed), and the original URL of their article points to http://feedproxy.google.com, as opposed to a location within WSJ.com. Additionally, the only tag for all content coming out of this feed is "Free". At the risk of stating the obvious, tagging all posts in your outgoing RSS feed as "free" is worse than useless. I have a hard time believing that they don't have the resources to do this well, which makes me wonder why it is allowed to be so sloppy.
NOTE: The following paragraph was edited because, well, it is completely wrong. The Huffington Post nails syndication. The links on the syndication page point out to RSS feeds of several dozen tags. In short, it rocks. END NOTE
The Huffington Post, which makes extensive use of tags to categorize posts, only offers 4 RSS feeds (Full feed, Latest news, The Blog, Featured posts) of content. Even though you can browse posts by tag on their site, you can't actually aggregate by these same tags. Given how easy it is to expose the content of any tag via an RSS feed, I can only conclude that the choice to not support feeds based on tags is tied to their business strategy. Given how little of the Huffington Post homepage is actually original content, it's surprising to see them reducing the number of ways people can interact with their site.
As a final note on this, it's not uncommon to see other more popular group blogs/major news outlets doing similar things. Talking Points Memo eschews aggregation by tags, and none of the major papers do much beyond feeds that summarize articles appearing in their standard sections. Within their RSS feeds, most major papers do little in the way of tagging content. Additionally, most papers/blogs include little more than a brief teaser within their feed. Given that most of these sites devote a fair amount of screen real estate to advertising (and some, like the NY Times, even embed ads in their feeds), their desire to bring eyeballs back to their sites is understandable.
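For anyone who wants to run the same kind of check on a feed, here is a minimal sketch using only Python's standard library. It counts the tags (RSS category elements) in a feed and lists item links that point somewhere other than the publisher's own host. The sample feed, its URLs, and the host name are made up for the example; a real check would fetch the feed first.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

# A made-up RSS 2.0 feed, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example World News</title>
    <item>
      <title>Story one</title>
      <link>http://feedproxy.example.com/~r/world/1</link>
      <category>Free</category>
    </item>
    <item>
      <title>Story two</title>
      <link>http://news.example.com/world/2</link>
      <category>Free</category>
      <category>Iran</category>
    </item>
  </channel>
</rss>"""


def audit_feed(xml_text, expected_host):
    """Return (tag counts, links whose host is not expected_host)."""
    root = ET.fromstring(xml_text)
    tag_counts = Counter()
    offsite_links = []
    for item in root.iter("item"):
        for category in item.findall("category"):
            if category.text:
                tag_counts[category.text.strip()] += 1
        link = item.findtext("link", default="").strip()
        if link and urlparse(link).hostname != expected_host:
            offsite_links.append(link)
    return tag_counts, offsite_links


tag_counts, offsite_links = audit_feed(SAMPLE_FEED, "news.example.com")
print("Tags:", dict(tag_counts))
print("Links pointing off-site:", offsite_links)
```

A feed where one tag accounts for every item, or where every link resolves to a tracking proxy, shows up immediately in this output.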
However, an advertising-driven paradigm seems unlikely to work, and it seems especially shortsighted given that excessive reliance on advertising money is frequently cited as a contributing factor in the decline of newspapers. The new media economy seems unlikely to be a link economy; micropayments, paywalls, and/or "better" targeted ads feel equally fruitless. A remix-with-attribution economy feels more likely, with the looming caveat that no one has really figured out how to make that work in a way that makes all the people in the supply chain happy. But the necessary first step is to move away from the notion that the finished work is the starting point or ending point for profiting from that work; that places too high a value on the role of content, and how people interact with information. Content -- the article -- is the middle point of the process, and on the web "content" can be understood as one point in an ongoing chain of synthesis/recontextualization.
Working with aggregation -- arguably the simplest means of republishing and recontextualizing content -- gives an incomplete yet suggestive view into two elements: how an organization understands the web, and how they view the role of content. From what I have seen, both mainstream news outlets and popular blogs do a poor job of making the most of their content. If I had to reduce this down to a single reason, I would say that there is a perceived need to control how people consume content, and that this is tied to the need to pass eyeballs over ads.
However, tethering the distribution mechanism of online content to a strategy designed to generate more pageviews (ie, News as SEO) seems destined to fail, as the gimmickry of SEO doesn't mix well with unbiased reporting.
And I would love to end this post with the next great idea on how to support working writers within this model, but hey, it's late and I need to pack. But I'm looking forward to learning more about other approaches over the next few days. I'll write up any ideas as they come, and for those who want to follow along, dip into the feed.









