Feb 20

Getting Real About “Open Data”

Accountability, BaleFire, Best Practices, Big Data, Data, Data Access, Definitions, Open Access, Open Data, Socrata, Transparency

By Dennis D. McDonald, Ph.D.

I recently spent a week at Socrata in Seattle with Jason Hare and the BaleFire Global team. We devoted five days to a “boot camp” on the ins and outs of Socrata’s cloud-based approach to implementing “open data” programs.

Time spent during the day at Socrata headquarters (and at night at the Pioneer Square Saloon) provided ample opportunities to contemplate the meaning of the term “open data.”

It’s a tricky term to define. The interests of those involved are diverse.

From a public policy perspective, the term is closely associated with concepts like “open government” and “transparency,” which can refer to making government operations more visible and understandable to the public. This is done either by (a) providing direct public access to financial, geographic, service, and other data that can be manipulated or analyzed online or by (b) making raw data available to public or private sector organizations that make the data visible, usable, or accessible for a particular purpose, sometimes for free, sometimes for a price.

If you try to draw a Venn diagram reflecting the various concepts touched on by a definition like the above you quickly get tangled up with terms like “privacy,” “secrecy,” “public engagement,” and — a reference that is surprising to me — “open source.” “Open data” seems to be one of those portmanteau terms that can mean a lot of different things depending on where you’re coming from. So while it’s potentially very useful as a rallying cry when you’re initially defining or organizing a community, the meaning of the term tends to crack when you look at it too closely, somewhat like the obsolete term “Web 2.0.”

That’s okay. It gets us thinking in the right direction. I say this since we are also starting to see more open questioning of what “open data” means. This is a healthy sign. One example is David Eaves’ “The Dangerous Mystique of the Open Data Business,” where he says:

The danger with putting the words “open data” before the word “business” is that it risks making people think that open data businesses are somehow unique. They are not.

Eaves goes on to say that “…Open data is not some magic pixie dust to cause normal business logic to disappear.” In other words, you still have to have a plan, you still need something to “sell,” and you still need to understand the costs involved in providing services to target users, even if some of your delivered value is based on government-sourced “open data.”

In “Open data has so much promise. But first we need to wrestle it back from the realm of geeks,” Benedict Dellot says this:

There is a misplaced assumption that everyone knows what data is, and how useful it can be. Yet that’s not the case at all. And even when people are aware of open data, many don’t think of it as relevant or meaningful to their lives or those around them; it’s just something the people in tech city fawn over. And therein lies the rub.

I think Dellot is a bit hard on “geeks” but I get where he’s coming from. I’ve always been interested in technology adoption and the cycles that communities go through when new technologies arrive on the scene. Acceptance involves many different groups working together. In the case of open data and the public we have today a vast array of networking technologies almost universally available for organizing, managing, and accessing data of all kinds. For this network of resources to be effective for promoting “open data” programs it can’t be dominated by any one group, including the “geeks” seemingly demonized by Dellot. As I suggested in “How to Make Your Datathon Effort Sustainable,” what happens before and after an open data event dominated by technical folks strongly influences the success of that event. You also need business people, planners, and subject matter experts involved, not just analysts or “geeks.”

None of this is new. Involving all the “stakeholders” in a process or project has long been a key recommendation for ensuring success. It’s certainly true of open data efforts as well. Attempting to exclude “geeks” from your open data project or program would be a mistake. You need expertise not only in handling large amounts of data but also in extracting meaning from data. This means making sure that someone on your open data project team has analytical as well as hardware, software, and subject matter skills.

What about the public? How can you ensure that the public — or whatever groups you are targeting — have the necessary wherewithal to appreciate and take advantage of what your open data project is providing? That’s what I think the real issue is, not whether “geeks” have too much influence on whether or not an open data catalog is complete.

As Eaves says in his article, an open data business is a “business” and has to be run like one whether you’re charging customers for something or not. That’s why how you design the services that accompany your open data project is so critical. Customers and the public need to understand what’s available to them, how much it costs, and how the services benefit them. If the “geeks” on your team understand all this, that’s great. If not, you need to seek out and involve other areas of expertise, starting with people who understand what enhanced data services can do for your target users and the problems you are trying to solve.
