The AtMoDat project

Atmospheric Model Data: Data Quality, Curation Criteria and DOI Branding.

Project Aim

Atmospheric models are an important element of climate research. Access to their output data is of interest not only to a wide scientific community but also to public services, companies, politicians and citizens. One way to make the data available is to publish them via a data repository. To ensure that datasets in a repository are indeed Findable, Accessible, Interoperable, and Reusable (i.e. FAIR[1]), it is essential that the data are stored together with detailed metadata and that the file structure and metadata follow an established standard. Furthermore, datasets are easier to find when the corresponding metadata are machine-readable and use a standardised vocabulary.

While data standardisation is well established in large, internationally coordinated model intercomparison projects (e.g. for climate models in CMIP[2]), joint standards are still lacking in many atmospheric modelling sub-disciplines, such as urban climate or cloud-resolving modelling.

The AtMoDat project, led by a team of atmospheric scientists and infrastructure providers, aims to improve the overall FAIRness of atmospheric model data and thus promote their re-use. Within the project, the ATMODAT standard has been developed, which includes precise recommendations for achieving enhanced FAIRness of atmospheric model data in repositories. A prerequisite of this standard is that the data are published with a DataCite[3] DOI. The ATMODAT standard specifies requirements for rich metadata with controlled vocabularies, structured landing pages (human- and machine-readable), and the format and structure of the data files.

The ATMODAT standard is easy to implement and provides checklists for data curators and data producers. In addition, the atmodat data checker has been developed to make it easier to check compliance with the ATMODAT standard. A dataset that complies with this standard will follow the FAIR principles and its metadata will be of high quality. If this compliance has been verified by the respective repository, the dataset can be labelled with the Earth System Data Branding (EASYDAB). This branding makes it easy for users to verify that the data are properly curated and the metadata have been quality assured.
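To give a rough idea of what an automated metadata check involves, the following minimal Python sketch reports which required global attributes are missing from a file's metadata. It is not the atmodat data checker and does not reproduce its interface. The function name `missing_attributes` and the list `REQUIRED_ATTRIBUTES` are hypothetical, and the attribute names are illustrative assumptions only. The ATMODAT standard itself defines the actual requirements.

```python
# Hypothetical list of required global attributes (illustrative assumption;
# the ATMODAT standard defines the actual requirements).
REQUIRED_ATTRIBUTES = ("Conventions", "title", "institution", "source")


def missing_attributes(global_attrs, required=REQUIRED_ATTRIBUTES):
    """Return the required attribute names that are absent or empty."""
    return [
        name
        for name in required
        if not str(global_attrs.get(name, "")).strip()
    ]


# Example metadata as a plain dictionary; in practice these values would be
# read from the global attributes of a data file.
example_attrs = {
    "Conventions": "CF-1.8",
    "title": "Example urban climate simulation",
    "institution": "",
}

print(missing_attributes(example_attrs))
```

In this sketch an attribute counts as missing when it is absent or contains only whitespace, so the example reports both the empty `institution` and the absent `source`.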

Assigning DataCite DOIs and metadata to datasets is the first step towards meeting the FAIR principles. Currently, however, the DataCite Metadata Schema does not provide an explicit property to store (or link to) information on data maturity. This information can comprise, for example, the results of various quality control checks that have been performed on the data and/or metadata prior to their publication. Within the AtMoDat project, we have recently proposed an extension to the DataCite Metadata Schema. The extension consists of a new Maturity Indicator property that would allow the results of data maturity checks to be placed in the DataCite metadata in a standardised format.


[1] Wilkinson et al., 2016: https://doi.org/10.1038/sdata.2016.18
[2] Juckes et al., 2020: https://doi.org/10.5194/gmd-13-201-2020
[3] https://datacite.org

Project Partners and Funding

The AtMoDat team consists of four project partners: the German Climate Computing Center (DKRZ), the Technische Informationsbibliothek (TIB), and the Universities of Leipzig (ULei) and Hamburg (UHH).


The project is funded by the German Federal Ministry of Education and Research (BMBF) within the framework of "Forschungsvorhaben zur Entwicklung und Erprobung von Kurationskriterien und Qualitätsstandards von Forschungsdaten". 

ATMODAT Standard

A quality guideline to improve the FAIRness of atmospheric model data.

atmodat data checker

A Python tool that checks your data files for compliance with the ATMODAT standard.

AtMoDat Team

The AtMoDat team consists of four project partners: TIB, DKRZ, UHH and ULei.

EASYDAB

EASYDAB is the new Earth System Data Branding that highlights carefully curated data publications.

News