(From Spring 2015 Newsletter) Dr. Rita Colwell, University of Maryland and Johns Hopkins University
A founding principle for GoMRI was to make data produced from the program discoverable and available to other scientists and the general public as soon as possible. Thus, the initiative has worked to capture and archive GoMRI data through the Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), whose mission is to “ensure a data and information legacy that promotes continual scientific discovery and public awareness of the Gulf of Mexico ecosystem.”
The massive amount of data being generated is unprecedented for the Gulf of Mexico and will serve as a critical resource for scientists, decision makers, and the public even after the program has ended.
There is also a growing national push for data archiving and availability, especially given the development of new technology to support this new open data environment. Constrained science budgets, increasingly complex problems to solve, and the growing costs of research are also helping to drive a cultural change among scientists. Interdisciplinary collaboration and shared resources are increasingly common, and data are potentially the most valuable resource being shared. Scientists from several disciplines can work from a single data set, increasing the amount of research that can be done at a reduced cost. However, data consistency and continuity remain an ongoing challenge. Programs like GoMRI, where researchers are from diverse fields, exemplify this challenge: if the data can be accessed but are not in a usable format, what is the value of the data?
The science community is beginning to recognize and address this need for large, accessible, integrated data sets. Recently, NOAA announced it will be partnering with five web organizations (Amazon Web Services, Microsoft Azure, IBM, Google, and the Open Cloud Consortium) through a Cooperative Research and Development Agreement (CRADA) to organize NOAA’s data and make it more easily accessible and usable by anyone.
In May 2015, the National Academies released an RFP focused on “the use of existing data collected in the Gulf of Mexico and associated coastal communities to advance the understanding of environmental conditions, ecosystem services, and community health and well-being, including community vulnerability, recovery, and resilience.” The letter of intent submission period is currently open and will close on June 15, 2015.
The American Institute of Biological Sciences (AIBS), funded by the National Science Foundation, recently convened a series of workshops focused on data, urging scientists to include their data in publications and working with the community to identify ways to foster the integration of complex data, ranging from genomics and phenomics all the way to ecosystem and continental scale data. A summary from the first workshop is available, and the report from the second workshop is expected this summer.
In May, the American Association for the Advancement of Science (AAAS) and the American Geophysical Union (AGU) held a workshop on “Reproducibility in the Field Sciences” focused on the preservation of, and access to, data used to support research publication. A primary goal of this discussion was to seek continuity across publishers in the expectation and enforcement of data support policies.
Data are critical to helping answer important research questions and make informed management and policy decisions. Access to data generated by GoMRI can make a huge difference when it comes to understanding, responding to, and mitigating future oil spills. GoMRI recognized this early in the program’s development, and GRIIDC serves as an excellent example of how data integration and consistency can provide exponential added value to the research community as a whole.