Dennis D. McDonald (ddmcd@ddmcd.com) consults from Alexandria, Virginia. His services include writing & research, proposal development, and project management.

NOAA’s Big Data Project Comes Into Focus

By Dennis D. McDonald

Introduction

On April 21, 2015 the U.S. Department of Commerce announced the initial participants in its big data collaboration project. According to Commerce Secretary Penny Pritzker, Amazon Web Services, Google Cloud Platform, IBM, Microsoft Corp., and the Open Cloud Consortium will make up the “anchor members” of the collaboration program to make more NOAA data available to the public and industry:

“As America’s Data Agency, we are excited about these collaborations and the opportunities they present to drive economic growth and business innovation,” said Secretary Pritzker. “The Commerce Department’s data collection literally reaches from the depths of the ocean to the surface of the sun and this announcement is another example of our ongoing commitment to providing a broad foundation for economic growth and opportunity to America’s businesses by transforming the Department’s data capabilities and supporting a data-enabled economy.”

I’ve written about this evolving program before (see here and here). The recent public announcement signals, in my opinion, one of the most interesting experiments with “open data” yet to come out of the U.S. Federal Government; more official details are provided by NOAA here.

This article discusses some of the challenges and opportunities this initiative will face. The opinions are my own and are based on 20+ years’ experience in managing data-intensive projects for both public and private sector clients.

Background

Making weather and environmental data available to the public is nothing new. In fact, the commercialization of weather data has long been viewed as a “poster child” for what can happen when the private sector innovates successfully using public data to create new businesses and jobs.

What’s different about this newly announced NOAA program is not just the potential “big data” scope of the program but the way in which private sector cloud vendors are involved as intermediaries not only to the public but also to potential data vendors and resellers.

Experiments like this NOAA program need to be encouraged. It’s one thing for governments to state that they want to be more “open” and “transparent.” It’s quite another to turn such policies into sustainable programs that are efficiently run and provide access and services that are valued by users. That’s what NOAA wants to do.

This collaboration experiment will be closely watched not only by other Department of Commerce agencies but also by other civilian and military agencies. One very simple reason for this attention is that open data programs cost money.

NOAA believes that collaboration with the private sector is one way to provide data and services to the public while minimizing direct taxpayer cost. As a consultant in the area of open data program management I appreciate what these costs are. As the saying goes, “there’s no such thing as a free lunch.” You need planning, collaboration, and the participation of both business people and technologists. You also need standards, technology, stakeholder involvement, governance, and leadership.

Given that, what follows are some points that we need to track going forward if we want programs like NOAA’s to succeed.

The changing role of the Cloud

In the old days making data accessible to users outside an organization meant establishing a secure interface to an internal data store via locally managed resources. Nowadays cloud vendors (such as NOAA’s collaborators) provide a range of services that include not only data management but desktop virtualization, application and user support, and a range of services that can effectively “replace” expensive company owned data centers and computer facilities.

In NOAA’s case, internal data operations, including the supercomputer facilities that support modeling and data management, won’t be replaced; they’ll be augmented. NOAA still has to support its ongoing programs’ data collection, analysis, and modeling infrastructure, as required by its enabling legislation and funding. NOAA’s internal and external stakeholder groups that have evolved over the years around the various programs won’t change overnight, either. In fact, some of these programs might even resist changing in response to the development of new data delivery channels. (That’s speculation on my part but I wouldn’t be surprised.)

Still, the existing landscape of NOAA data users is bound to change over time as data access and usage barriers continue to change.

Other programs in the Federal government will be watching the NOAA experiment (and I’m using the word “experiment” on purpose). NOAA data appeals to many different user groups worldwide. The cloud-mediated channels for delivering data-related value to these groups will be just as varied, ranging from mass-market commodity services to high-end, narrowly targeted custom services.

The cost of priming the pump

When I first learned about the NOAA program last year I began contacting individuals and companies that had publicly identified themselves at NOAA events as being interested in the program. It wasn’t hard to find out such public information as NOAA was openly gathering input from “the usual suspects,” i.e., hardware and software vendors, prime contractors and subcontractors, and others interested in Federal contracting such as consultants (like me). Since I’ve been working with companies already active in the open data space (for example, Balefire Global and Socrata), it wasn’t much of a stretch to recognize the relevance of open data systems and processes to NOAA’s plans.

Once I started talking with potential contractors and subcontractors it quickly became clear there was still a lot of uncertainty and skepticism out there about the NOAA program. I started to hear the same questions being asked about the NOAA program:

  1. How am I going to make any money from this?
  2. How can I get my management to approve a bid if we’re not going to see any revenue for (insert number of months)?
  3. We’re a system integrator—who’s going to pay us?
  4. Won’t a requirement for “free public access” scare off potential investors?
  5. Let’s say we develop a commercial product using government data. Will we have to provide it to government agencies for free?
  6. This isn’t a standard procurement where the government goes out and purchases services. How will NOAA’s acquisition processes handle this?

My guess is that addressing questions like the above increased the length of time it took to get the program off the ground, despite the fact that commercial viability of NOAA data has long been a marketplace reality.

Admittedly, I approached the NOAA program with preconceived notions about how government procurement works and with some misgivings about the government’s ability to manage a large distributed civilian program with many moving parts. It also became apparent to me that whether or not an organization would be interested in NOAA’s effort was not necessarily related to having “deep pockets” but to the ability to think long-term and to be agile and nimble. I concluded that helping NOAA achieve its goal of increasing access to its data via this program is going to require not just strategic thinking but also a willingness to think and act creatively.

In other words, “Same old same old” just won’t cut it.

Open data program governance

One thing I’ve learned working with open data clients is that success is not always about the technology; it’s often about program management and governance. Open Data 1.0 portals can be very quick to spin up, but moving beyond that to where open data programs are effectively aligned with the goals and objectives of the sponsoring agencies is more of a management and governance challenge than a technology challenge.

True, technology can be a mighty enabler; just look at the infrastructure made possible by NOAA’s anchor partners. You could say that NOAA is effectively outsourcing not only data management but data governance processes as well, since companies such as Amazon Web Services, Google, and Microsoft Azure will be managing the relationship with third-party vendors, developers, and marketers.

The advantage of this type of arrangement to NOAA and taxpayers is that the anchor partners agree to absorb costs in return for sharing in downstream new-product revenue. The potential disadvantage is that opportunities could be lost that might otherwise have surfaced in a more traditionally structured government services procurement.

Here are a few things to watch out for as the program moves forward:

  1. What mechanisms exist for NOAA program managers to keep track of the uses of data programs supplied via the new collaboration program?
  2. Given the possible increase in the number of stakeholders involved in the data management lifecycle, how will data and metadata standardization efforts be synchronized?
  3. What conditions if any will be placed on government use of the data provided via commercial vendors?
  4. What type of governing body, if any, will oversee NOAA-initiated partnership efforts?
  5. What type of public transparency and reporting will such a governing body manage, given the potential mix of public and private interests?
  6. How closely will the costs to the public of the collaboration process be tracked and reported?

Conclusions

My initial skepticism about the NOAA program was probably based more on traditional expectations about how the government acquires and operates its services. NOAA is trying something different in order to expand the use of its programs’ data beyond what it can directly afford. That’s a good thing to do.

As an experiment it’s bound to have ups and downs. There’s no guarantee that making data more available for exploitation will result in commercially viable products. But such uncertainty has always been the case no matter what example we look at on the civilian or military side. Where would electronic miniaturization efforts be without the stimulus of World War II radar-guided weapons research? Where would the civilian aerospace market be without military funding of jet engine technology? And where would the Internet be without DARPA funding?

I’m not suggesting that there will be similar advances coming out of the NOAA “big data” program. But as an example of an innovative approach to turning weather and environmental data into useful products and services, I’m looking forward optimistically to seeing how NOAA and its partners perform.

Related reading:

Copyright © 2015 by Dennis D. McDonald, Ph.D. 

Do People Really Understand What “Open Data” Means?

Is Your Organization Ready for the Third Age of Open Data?