The AtMoDat project
Atmospheric Model Data: Data Quality, Curation Criteria and DOI Branding.
Project Aim
Atmospheric models are an important element of climate research. Access to the data they produce is of interest not only to a wide scientific community but also to public services, companies, politicians and citizens. One way to make the data available is to publish them via a data repository. To ensure that datasets in a repository are indeed Findable, Accessible, Interoperable, and Reusable (i.e. FAIR[1]), it is essential that the data are stored together with detailed metadata and that the file structure and metadata follow an established standard. Furthermore, datasets are easier to find when the corresponding metadata are machine-readable and use a standardised vocabulary.
While data standardization is well established in large, internationally coordinated model intercomparison projects (e.g. for climate models in CMIP[2]), joint standards are still lacking in many atmospheric modelling sub-disciplines, such as urban climate or cloud-resolving modelling.
The AtMoDat project, led by a team of atmospheric scientists and infrastructure providers, aims to improve the overall FAIRness of atmospheric model data and thus promote their re-use. Within the project, the ATMODAT standard has been developed, which includes precise recommendations for achieving enhanced FAIRness of atmospheric model data in repositories. A prerequisite of this standard is that the data are published with a DataCite[3] DOI. The ATMODAT standard specifies requirements for rich metadata with controlled vocabularies, structured landing pages (human- and machine-readable), and the format and structure of the data files.
The ATMODAT standard is easy to implement and provides checklists for data curators and data producers. In addition, the atmodat data checker has been developed to make it easier to check compliance with the ATMODAT standard. A dataset that complies with this standard will follow the FAIR principles and its metadata will be of high quality. If this compliance has been verified by the respective repository, the dataset can be labelled with the Earth System Data Branding (EASYDAB). This branding makes it easy for users to verify that the data are properly curated and the metadata have been quality assured.
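To illustrate the idea of a checklist-based compliance check, here is a minimal Python sketch. It is not the atmodat data checker and does not reproduce the ATMODAT checklists: the attribute names, the split into mandatory and recommended entries, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical checklist entries (illustrative assumptions only,
# NOT the actual ATMODAT requirements).
MANDATORY = ["title", "institution", "Conventions"]
RECOMMENDED = ["creator", "license"]


def check_attributes(attrs: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the checklist entries that are missing or empty in attrs."""

    def missing(names: List[str]) -> List[str]:
        # An attribute counts as missing if it is absent or only whitespace.
        return [n for n in names if not str(attrs.get(n, "")).strip()]

    return {
        "missing_mandatory": missing(MANDATORY),
        "missing_recommended": missing(RECOMMENDED),
    }


def is_compliant(attrs: Dict[str, str]) -> bool:
    """A dataset passes this sketch if no mandatory entry is missing."""
    return not check_attributes(attrs)["missing_mandatory"]


# Example metadata with an empty mandatory attribute.
example = {
    "title": "Urban climate simulation",
    "institution": "Example University",
    "Conventions": "",
}
report = check_attributes(example)
print(report)
print("compliant:", is_compliant(example))
```

In this sketch, missing recommended entries are reported but do not cause the check to fail, which is one possible way to separate hard requirements from recommendations.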
Assigning DataCite DOIs and metadata to datasets is the first step towards meeting the FAIR principles. Currently, however, the DataCite Metadata Schema does not provide an explicit property to store (or link to) information on data maturity. This information can comprise e.g. the results of various quality control checks that have been performed on the data and/or metadata prior to their publication. Within the AtMoDat project, we have recently proposed an extension to the DataCite Metadata Schema. The extension consists of a new Maturity Indicator property that would allow the results of data maturity checks to be placed in the DataCite metadata in a standardised format.
[1] Wilkinson et al., 2016: https://doi.org/10.1038/sdata.2016.18
[2] Juckes et al., 2020: https://doi.org/10.5194/gmd-13-201-2020
[3] https://datacite.org
Project Partners and Funding
The AtMoDat team consists of four project partners: the German Climate Computing Center (DKRZ), the Technische Informationsbibliothek (TIB), and the Universities of Leipzig (Ulei) and Hamburg (UHH).
The project is funded by the German Federal Ministry of Education and Research (BMBF) within the framework of "Forschungsvorhaben zur Entwicklung und Erprobung von Kurationskriterien und Qualitätsstandards von Forschungsdaten".
News
AtMoDat meets GeoKur: 28 Sep 2021
We met with our sister project GeoKur.
Workshop on 9 Nov 2021: Let's standardize your data!
Time: 10:00 – 16:00. Please join us and register!
AtMoDat @ METTOOLS: 22 Sept 2021
The AtMoDat team contributes a talk and a user workshop, 16:00–17:30 CEST.