Open Research Data Policy Becomes More Concrete with NASA’s “Data Management Plans”
Go to an official description of NASA's research access policies and you’ll see that, from 2016 forward:
- Proposals to NASA for research funding must be accompanied by a “data management plan,” and
- Peer-reviewed journal articles based on NASA-funded research must be submitted for public access to the Agency’s research portal.
Setting aside the copyright and intellectual property concerns the second requirement raises, let's focus for now on the “data management plan” and what it requires.
This is the core paragraph describing that requirement:
All proposals or project plans submitted to NASA for scientific research funding will be required to include a DMP. The DMP should describe whether and how data generated through the course of the proposed research will be shared and preserved (including timeframe), or explain why data sharing and/or preservation are not possible or scientifically appropriate. At a minimum, DMPs must describe how data sharing and preservation will enable validation of published results or how such results could be validated if data are not shared or preserved.
Individual programs at NASA (e.g., Space Biology, Earth Sciences, Heliophysics, and Planetary Sciences) reference the above paragraph defining the requirements for DMPs but necessarily put their own spin on them. Sharing research data is nothing new to NASA-funded scientists, and the practice has evolved over many years; for example, see NASA's EOSDIS Distributed Active Archive Centers (DAACs) or how these centers will help to share data generated by the European Space Agency (ESA).
The history is both good and bad. It’s good since data sharing has such a healthy history. The movement to expand access to primary research data in all areas of science has grown substantially in recent years, with NASA being a prime example. Check out version 1 of NASA’s Earth Science Division’s Guidelines for Development of a DATA MANAGEMENT PLAN (DMP) from 2011 and you’ll see the foundation for much of what appears in the now official NASA-wide policy. Under the section “Post-Mission Stewardship and Access,” for example, the DMP is required to address a number of very practical matters, including transition to science data centers, directories and catalogs, standards and policies, and networking requirements.
The “bad” history – if it can be called that – is that, despite NASA historically being very good at sharing data, its many different programs and centers may have pursued their own research agendas. They have also built collaborative relationships with industry and academia, both domestically and internationally, in ways that may be difficult to monitor or control centrally (assuming that is required). For example, read a history like NASA's First A: Aeronautics from 1958–2008 and you will better understand the historical roots of NASA’s research decentralization and how this eventually conflicted with the more centralized oversight required once manned spaceflight took top priority over aeronautics.
That being said, NASA’s decision to require data management plans as a condition for research funding could accelerate the general trends towards more open science as the requirements for DMPs ripple through the various programs. A major question that NASA then needs to address (along with other Federal agencies that conduct research and generate data such as NOAA, EPA, DOT, and USDA) is the manner in which its data management plans will be governed, how they will be evaluated, and how they will be sustained.
Requiring that data be accessible assumes the existence of an infrastructure that supports ongoing maintenance and access for data and metadata. When many different organizations and centers manage data and metadata, someone somewhere is bound to wonder whether the overall money required to support distributed operations is being spent and managed efficiently.
There is also the question of how rigorously the DMP requirement will be enforced. Will this just be one more hoop proposal writers need to jump through because they believe they need to pay "lip service" to a requirement they think they're already complying with anyway? Or will NASA be paying attention to how people actually implement such plans?
In those situations where a data sharing infrastructure does not already exist, who will design, build, and pay for it? What sort of private sector involvement will be required? In NOAA’s evolving open data program, for example, private sector cloud firms such as Amazon, Google, and Microsoft are working through ways to help commercialize access to increasing volumes of NOAA data that NOAA cannot afford, on its own, to make accessible. While one could argue that commercial demand is unlikely to arise quickly for the steady stream of data being sent back to Earth by Mars orbiters, the data being generated by NASA’s aeronautics research programs related to powerplants and fuel consumption may have commercial value in a much shorter term.
What role then should an agency play in making sure that such data are available for further research? How much responsibility for sustainable data management plans should be put on the researchers and their institutions for ongoing access? And how will such programs be paid for?
Copyright © 2016 by Dennis D. McDonald