McEwen, Leah; Bruno, Ian
The overarching goal of the WorldFAIR Chemistry Work Package (WP03) is to support the use of chemical data standards in research workflows, between and across disciplines. This will enable downstream data reuse through provision of practical direction and resources. The aim of this deliverable is to establish a framework that can be used by policymakers and developers of services and tools to support FAIR (Findable, Accessible, Interoperable and Reusable) reporting of chemical data. Specific objectives are to highlight applicability of existing standards at a practical level and to identify gaps that need to be addressed to achieve wider data reuse goals.
This report reviews some of the critical and persistent issues around documentation of chemical information. These were identified through a series of webinar panels on the theme: “What is a chemical?”, and through other conferences, workshops, and ongoing collaborative projects run as part of the WorldFAIR project and by the International Union of Pure and Applied Chemistry (IUPAC, the lead organisation of WP03).
Chemicals are everywhere and every tangible object has a chemical nature that impacts its use and behaviour in the environment. As chemical data and chemical principles are increasingly applied broadly across disciplines, the range of representations and contexts for chemical substances and data becomes more diverse and harder to define precisely. Molecular entities are fundamental to our understanding of material properties and underlie the configuration of many chemical data models and resources, but it is also critical to look beyond the molecule to particles, surfaces, and states, and their behaviour under different conditions. Few chemistry-related disciplines have mature standards, and better practices in data reporting and interoperability are needed across the board, in both industry and academia. This will allow sharing and reuse requirements to be met in relation to international chemicals management policies and sustainable development goals.
This report additionally considers documentation requirements to achieve FAIR sharing of chemistry data in ways that are Reliable, Interpretable, Processable, and Exchangeable (RIPE), and with minimal loss of quality. Increasing the level of consumable FAIR data depends upon documenting data upstream of sharing to ensure that meaning and quality can be assessed and reassessed appropriately. It is not enough for data objects to be accessible; data need to be accompanied by metadata which provide the contextual information required to enable interoperability and reuse. Fully articulating the scope, structure, and exposure of metadata is critical to enable broader technical mechanisms for programmatic data exchange. The RIPE framework can help research ecosystems across sectors to focus on the information requirements, resources and practices needed to provide data that are mature for sharing and fully AI-ready across a broad range of use cases. Consistent and comprehensive communication of existing and emerging standards and resources is an important priority to effectively address the challenges confronting meaningful and effective reuse of chemistry data.
Collectively, the chemistry community has over a century of experience in developing and refining standards for communicating high quality chemical information. Explorations undertaken within and alongside WP03 are helping to clarify where these fall short of FAIR ideals, and how we can advance in addressing more complex needs across chemistry and other disciplines. While we have many of the components needed, further refinement of current processes and tools is necessary to enable establishment and use of workflows for sharing quality chemical data, particularly in interdisciplinary contexts. The present focus on the FAIR data principles provides a framework to enable previously well-established chemistry standards to become accessible and applicable for automated programmatic reuse. We envisage this report as a living document evolving over the course of the project, as we further assess IUPAC digital standards to support FAIR chemical data sharing. Future sections are planned that will provide a Roadmap and a Sustainability Blueprint for standards development and adoption as part of our collective recommendations for supporting chemical data reporting policy and practice.
The primary target audience for this report is the range of professionals involved in building and managing systems and services that support process engineers, scientists and other researchers working with data. We will also reach those involved in information management and communication, including professionals in publishing houses, libraries, standards organisations and at other information resources. Additional audiences include chemists, data scientists and other researchers who are actively working with informatics and programmatic applications, and those who are in positions to influence policy that impacts chemical data reporting and exchange.
Other deliverables under development in WorldFAIR Chemistry (WP03) will further demonstrate and facilitate the use of chemistry data standards, including a digital cookbook of interactive recipes demonstrating how to handle chemical data (D3.2 Training package), and protocol specifications for exchanging chemical representations and other metadata via Application Programming Interface (API) services (D3.3 Utility services).
WP03 activities are coordinated through the International Union of Pure and Applied Chemistry (IUPAC), the world authority on chemical nomenclature and terminology that constitute a common global language for communicating chemistry. In the context of the formal IUPAC process for reviewing and adopting consensus standards in chemistry, this work should be regarded as provisional guidance. Complete review and adoption of standards through IUPAC to reach the status of “Recommendation,” which has a specific meaning in the IUPAC lexicon, will occur after WP03 is complete.
The report is available on Zenodo.