Data systems

Challenges: Arctic data systems are operated by many actors, including individual polar research organizations, national data centres, regional and international data infrastructures, and global observing programmes within specific thematic domains. Some data systems are thematic, focusing on e.g. sea ice data, while others hold data from many disciplines, e.g. maintaining a broad range of observational, derived and simulated data for a particular geographic area. Because many data systems have evolved largely independently, a major challenge is that they offer different search and access mechanisms and support different metadata and data formats. This makes it hard to integrate data from multiple systems, and complex technical implementation is needed to make interoperability work in practice. A second major challenge is the gap between data collectors producing new datasets and data managers tasked with ensuring long-term storage and open access. A mediator role between these two communities of scientists and technical experts needs to be established, enabling competence building in FAIR data management and providing support in documenting and formatting datasets. A third major challenge is the lack of overview of Arctic observing capacity, as the observing systems in this region are scattered and typically operated by research organizations in short-term projects. The rapidly growing amount of data resulting from advances in observing and analytical technology is a further challenge in ensuring that the Arctic is properly monitored and managed for sustainable development of this vulnerable region.

Results from INTAROS: A main goal is that data generated with support from the project are stored in an established data repository. This ensures long-term access to the datasets, with proper citation mechanisms (e.g. DOIs) that give credit to the data collectors. The data managers in the project have supported data collectors in preparing data in standard formats and in selecting suitable data repositories. Some scientific disciplines lack procedures and tools for documenting and formatting data in a standard way. To help remedy this lack of standardization for ocean mooring data, data managers and data collectors have joined forces to define a format compliant with established standards for metadata and data, such as NetCDF-CF, ACDD and OceanSITES. The resulting specification is publicly available for other ocean scientists to use for their mooring data, supporting their production chains into Arctic data systems and enabling the data to be searched by different applications (see figure below). To provide a joint entry point to data and services developed in the project, the iAOS portal has been developed based on CKAN, a world-leading open-source data management system. The main components of the iAOS portal are described below.

 

Legend: Data value chains for integrating INTAROS datasets into iAOS
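To illustrate how standard metadata makes a dataset searchable, the following minimal Python sketch checks a set of global attributes against the ACDD convention before a file enters a data value chain. It is not the INTAROS mooring specification: the attribute names are taken from ACDD 1.3, but the "discovery" subset, the function names and the example values are assumptions chosen for illustration.

```python
# Global attributes that ACDD 1.3 lists as "highly recommended".
ACDD_HIGHLY_RECOMMENDED = ("title", "summary", "keywords", "Conventions")

# Illustrative subset of ACDD "recommended" attributes that help catalogue
# search. The selection is an assumption, not the INTAROS specification.
ACDD_DISCOVERY_SUBSET = (
    "id", "creator_name", "institution", "license",
    "time_coverage_start", "time_coverage_end",
    "geospatial_lat_min", "geospatial_lat_max",
    "geospatial_lon_min", "geospatial_lon_max",
)


def missing_attributes(global_attrs, required):
    """Return the names in `required` that are absent or blank."""
    missing = []
    for name in required:
        value = global_attrs.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def check_metadata(global_attrs):
    """Summarize which ACDD attributes a dataset still lacks."""
    return {
        "highly_recommended": missing_attributes(
            global_attrs, ACDD_HIGHLY_RECOMMENDED),
        "discovery": missing_attributes(
            global_attrs, ACDD_DISCOVERY_SUBSET),
    }


# Hypothetical global attributes for a mooring dataset.
example_attrs = {
    "title": "Example mooring temperature record",
    "summary": "Hourly temperature from a hypothetical Arctic mooring.",
    "keywords": "EARTH SCIENCE > OCEANS > OCEAN TEMPERATURE",
    "Conventions": "CF-1.8, ACDD-1.3",
    "time_coverage_start": "2018-08-01T00:00:00Z",
    "geospatial_lat_min": 81.5,
    "geospatial_lat_max": 81.5,
}

report = check_metadata(example_attrs)
```

In practice the attribute dictionary would be read from the NetCDF file itself, and the list of missing attributes could be returned to the data collector before the dataset is submitted to a repository.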

Within INTAROS, the focus has been on providing access to both historical and new datasets held by partners or collaborating organizations. All datasets published with support from the project are registered in the INTAROS data catalogue. This catalogue provides a common entry point to these datasets, offering easy search of the data and links to the data repositories where the data are stored. Additional relevant Arctic datasets are harvested into a larger data catalogue in the iAOS portal, enabling users to search for both INTAROS data and data from external parties through a common catalogue, the iAOS portal data catalogue.
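Since the portal is based on CKAN, a catalogue of this kind can in principle also be searched programmatically through the standard CKAN Action API. The Python sketch below assumes that this API is exposed. The base URL is a placeholder rather than the actual portal address, and the function names are hypothetical. It builds a `package_search` request and extracts the title and first resource link of each matching dataset.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address: substitute the real base URL of the CKAN portal.
PORTAL_URL = "https://catalogue.example.org"


def build_search_url(base_url, query, rows=10):
    """Build a CKAN Action API `package_search` request URL."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarize_results(response):
    """Extract (title, first resource URL) pairs from a search reply."""
    if not response.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summary = []
    for dataset in response["result"]["results"]:
        resources = dataset.get("resources", [])
        # A dataset may be registered without any downloadable resource.
        link = resources[0].get("url") if resources else None
        summary.append((dataset.get("title"), link))
    return summary


def search_catalogue(base_url, query, rows=10):
    """Run the search over HTTP (needs network access to a live portal)."""
    with urlopen(build_search_url(base_url, query, rows)) as reply:
        return summarize_results(json.load(reply))
```

A call such as `search_catalogue(PORTAL_URL, "sea ice")` would then return a list of dataset titles with links to the repositories holding the data, which mirrors the role of the catalogue as an entry point rather than a storage location.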

INTAROS carried out a survey of Arctic in situ observing systems with the aim of assessing the observing capacity for key parameters within different disciplines, including ocean and sea ice, terrestrial and atmosphere. To maintain and extend this important information, NERSC developed ARCMAP, a user-friendly web system for collecting and updating descriptions of Arctic in situ observing systems. ARCMAP provides an overview of the location and central characteristics of the assessed systems and generates a set of plots summarizing statistics on their usage and capabilities (see figures below).

The iAOS cloud platform enables large amounts of data to be integrated and analyzed using state-of-the-art cloud technology. The use of this cloud platform is demonstrated through showcases that combine data from partners with data from external parties and process them jointly in the “Ellip Solutions” platform.

Legend: Overview map of in situ observing systems, individual and clustered, developed in the ARCMAP project. The map provides popup “bubbles” with central information about the observing systems.

Legend: Statistics on the distribution of ocean and sea ice observing systems across application areas.