Project outcomes and documents

Management (WP1) & Synthesis (WP2)

WorldFAIR First Policy Brief (D1.3)

Authors: Hodson, Simon; Gregory, Arofan

In this policy brief the WorldFAIR project makes seven policy recommendations relevant to EOSC. Evidence and analysis are presented for each recommendation. The Policy Brief and recommendations draw on project deliverables and discussions held at workshops including project participants and wider stakeholders. The policy brief also includes a short report on progress made by the project.

FAIR Implementation Profiles (FIPs) in WorldFAIR: What Have We Learnt? (D2.1)

Authors: Gregory, Arofan; Hodson, Simon

Report on the FAIR Implementation Profiles completed by project Case Studies in 2022. Project Deliverable D2.1 for EC WIDERA-funded project “WorldFAIR: Global cooperation on FAIR data policy and practice”.

This report gives a brief overview of the experience of the WorldFAIR project in using FAIR Implementation Profiles (FIPs). It describes the WorldFAIR project, its objectives and its rich set of Case Studies; and it introduces FIPs as a methodology for listing the FAIR implementation decisions made by a given community of practice. Subsequently, the report gives an overview of the initial feedback and findings from the Case Studies, and considers a number of issues and points of discussion that emerged from this exercise. Finally, and most importantly, we describe how we think the experience of using FIPs will assist each Case Study in its work to implement FAIR, and will assist the project as a whole in the development of two key outputs: the Cross-Domain Interoperability Framework (CDIF), and domain-sensitive recommendations for FAIR assessment.

We hope this report will be of interest to data experts who want to find out more about the WorldFAIR project, its remarkable and diverse array of Case Studies, and about FIPs. It is important to stress that this report does not set out to give a comprehensive appraisal of the FIPs approach and could not do so. All the WorldFAIR Case Studies have developed an initial FIP, but the process of reflection on practice will continue throughout the project. Each Case Study will complete at least one further FIP, and in some cases more than one, towards the end of the project and this will enrich our understanding of the utility of the approach. At that stage, we intend to be able to incorporate some robust prospective and aspirational considerations, and we need to consider how best to represent this in the FIPs.

As noted above, the final section of this report looks forward to the development of the Cross-Domain Interoperability Framework (CDIF), and domain-sensitive recommendations for FAIR assessment. On both these counts, we consider that the FIPs approach has helped considerably:

For the CDIF, through helping refine our initial functional analysis of the requirements for cross-domain FAIR, and—as predicted—helping identify some candidate cross-domain standards.  

For the FAIR assessment recommendations, through demonstrating that the FIPs can provide an empirical basis for such recommendations, reflecting both the current practice, and the aspirations of a given community or research domain.

Chemistry (WP3)

Digital Recommendations For Chemistry FAIR Data Policy And Practice

Authors: McEwen, Leah; Bruno, Ian

The overarching goal of the WorldFAIR Chemistry Work Package (WP03) is to support the use of chemical data standards in research workflows, between and across disciplines. This will enable downstream data reuse through provision of practical direction and resources. The aim of this deliverable is to establish a framework that can be used by policymakers and developers of services and tools to support FAIR (Findable, Accessible, Interoperable and Reusable) reporting of chemical data. Specific objectives are to highlight applicability of existing standards at a practical level and to identify gaps that need to be addressed to achieve wider data re-use goals. 

This report reviews some of the critical and persistent issues around documentation of chemical information. These were identified through a series of webinar panels on the theme: “What is a chemical?”, and through other conferences, workshops, and ongoing collaborative projects run as part of the WorldFAIR project and by the International Union of Pure and Applied Chemistry (IUPAC, the lead organisation of WP03). 

Chemicals are everywhere and every tangible object has a chemical nature that impacts its use and behaviour in the environment. As chemical data and chemical principles are increasingly applied broadly across disciplines, the range of representations and contexts for chemical substances and data become more diverse and less easy to precisely define. Molecular entities are fundamental to our understanding of material properties and underlie the configuration of many chemical data models and resources, but it is also critical to look beyond the molecule to particles, surfaces, and states, and their behaviour under different conditions. Few chemistry-related disciplines have mature standards, and better practices in data reporting and interoperability are needed across the board, in both industry and academia. This will allow sharing and reuse requirements to be met in relation to international chemicals management policies and sustainable development goals.  

This report additionally considers documentation requirements to achieve FAIR sharing of chemistry data in ways that are Reliable, Interpretable, Processable, and Exchangeable (RIPE), and with minimal loss of quality. Increasing the level of consumable FAIR data depends upon documenting data upstream of sharing to ensure that meaning and quality can be assessed and reassessed appropriately. It is not enough for data objects to be accessible; data need to be accompanied by metadata which provide the contextual information required to enable interoperability and reuse. Fully articulating the scope, structure, and exposure of metadata is critical to enable broader technical mechanisms for programmatic data exchange. The RIPE framework can help research ecosystems across sectors to focus on information requirements, resources and practices required to facilitate provision of data that are mature for sharing, and fully AI-ready across a broad range of use cases. Consistent and comprehensive communication of existing and emerging standards and resources is an important priority to effectively address the challenges confronting meaningful and effective reuse of chemistry data. 

Collectively, the chemistry community has over a century of experience in developing and refining standards for communicating high quality chemical information. Explorations undertaken within and alongside WP03 are helping to clarify where these fall short of FAIR ideals, and how we can advance in addressing more complex needs across chemistry and other disciplines. While we have many of the components needed, further refinement of current processes and tools is necessary to enable establishment and use of workflows for sharing quality chemical data, particularly in interdisciplinary contexts. The present focus on the FAIR data principles provides a framework to enable previously well-established chemistry standards to become accessible and applicable for automated programmatic reuse. We envisage this report as a living document evolving over the course of the project, as we further assess IUPAC digital standards to support FAIR chemical data sharing. Future sections are planned that will provide a Roadmap and a Sustainability Blueprint for standards development and adoption as part of our collective recommendations for supporting chemical data reporting policy and practice.

The primary target audience for this report is the range of professionals involved in building and managing systems and services that support process engineers, scientists and other researchers working with data. We will also reach those involved in information management and communication, including professionals in publishing houses, libraries, standards organisations and at other information resources. Additional audiences include chemists, data scientists and other researchers who are actively working with informatics and programmatic applications, and those who are in positions to influence policy that impacts chemical data reporting and exchange.

Other deliverables under development in WorldFAIR Chemistry (WP03) will further demonstrate and facilitate the use of chemistry data standards, including a digital cookbook of interactive recipes demonstrating how to handle chemical data (D3.2 Training package), and protocol specifications for exchanging chemical representations and other metadata via Application Programming Interface (API) services (D3.3 Utility services).  

WP03 activities are coordinated through the International Union of Pure and Applied Chemistry (IUPAC), the world authority on chemical nomenclature and terminology that constitute a common global language for communicating chemistry. In the context of the formal IUPAC process for reviewing and adopting consensus standards in chemistry, this work should be regarded as provisional guidance. Complete review and adoption of standards through IUPAC to reach the status of “Recommendation,” which has a specific meaning in the IUPAC lexicon, will occur after WP03 is complete.

Nanomaterials (WP4)

Nanomaterials Domain-Specific FAIRification Mapping (D4.1)

Authors: Lynch, Iseult; Afantitis, Antreas; Exner, Thomas; Papadiamantis, Anastasios

WP04 of WorldFAIR aims to increase the FAIRness (Findability, Accessibility, Interoperability and Reusability) of nanomaterials datasets and computational models. The initial focus is on toxicity and safety-related datasets as this is where the bulk of the effort has been to date.  We note that nanomaterials are a very broad category of materials, combining chemicals and materials features, and overlapping strongly with the emerging domain of advanced materials. Nanosafety is a very broad domain, covering exposure, toxicity and risk assessment and requires extensive characterisation of the pristine (as-produced) nanomaterials and their physical, chemical, biological and macromolecular transformations within the various environments in which they are present. Thus, the positioning of nanomaterials between Chemistry (WP03) and Geochemistry (WP05) is intentional, as there are strong overlaps of concepts and approaches with both, and solutions applicable to one of these domains are likely to be applicable to others also, leading to the potential for mutual learning and accelerated implementation of the FAIR concepts across these domains.  

This specific deliverable lays out the domain and its communities, and the various projects and contributors active in the FAIR-nanomaterials and nanosafety domain. It then presents our initial FAIR Implementation Profile (FIP), which describes the current state of the field (an ‘As-Is’ FIP) and discusses the domain-specific challenges relating to nanomaterials and its FAIR landscape. The deliverable then lays out the developments needed to reach the ‘To-Be’ FIP, as the optimal approach to make nanomaterials and nanosafety data FAIR, based on current best practice across the FAIR community. We note that the As-Is FIP will be updated at the end of the WorldFAIR project, to capture: the rapid development in the field; our own efforts to enhance the number of domain-specific FAIR-enabling Resources (FERs) and FAIR-supporting resources (FSRs); FIPs underway in the aforementioned WorldFAIR Chemistry (WP03) and Geochemistry (WP05) case studies, as well as the efforts underway in the Partnership for Assessment of the Risks of Chemicals (PARC), which has a very strong FAIR data focus. Subsequent activities in WP04 will implement a case study to foster development and piloting of interoperability standards, and guidelines for increasing FAIRness in the interlinked scientific disciplines of chemical toxicity, nanomaterials toxicity and materials modelling.

The FAIR mapping represents a critical step towards identifying both the domain-specific features and the general features needed to maximise nanosafety data and model FAIRness, highlighting areas for further development and standardisation especially in the domain-specific aspects such as metadata standards and ontologies. Existing ontologies have major gaps in their semantic coverage, and project-specific terminology is not fully integrated/accessible via ontology look-up services. Building on the mapping and ‘As-Is’ FIP for nanosafety, WP04 will develop an index (registry) and workflows for FAIRification of nanoinformatics tools and models (D4.2), and recommendations for nanomaterials-specific human and machine-readable provenance and persistence policies (D4.3).       

This nanomaterials FIP, and its subsequent iteration, form a critical part of the overarching WorldFAIR cross-domain mapping and indeed have already been integrated into D2.1 and D11.1.

Geochemistry (WP5)

Formalisation of OneGeochemistry (D5.1)

Report on the formalisation of the OneGeochemistry CODATA Working Group. 

Project Deliverable D5.1 for EC WIDERA-funded project “WorldFAIR: Global cooperation on FAIR data policy and practice”.

The WorldFAIR Geochemistry Work Package Deliverable 5.1 sets out to formalise the OneGeochemistry Initiative. With the exponential growth of data volumes and production, better coordination and collaboration is needed within the Earth and Planetary Science community producing geochemical data. The mission of OneGeochemistry is to address this need and in order to do so effectively the OneGeochemistry Interim Board has applied to become the OneGeochemistry CODATA Working Group. This application has been approved by the CODATA Executive Committee. The OneGeochemistry CODATA Working Group will be led by a chair and co-chair and will form expert advisory groups where required. Becoming a CODATA Working Group gives the OneGeochemistry Initiative credibility and authority to successfully pursue a long-term governance structure and accomplish the other WorldFAIR deliverables of WP05 (Geochemistry). 

Accomplishing an outline of the methodology used to populate and update FAIR Implementation Profiles and to promulgate knowledge of them, as well as creating a set of guidelines for laboratories and repositories on how to use FAIR Implementation Profiles and common variables to QA/QC data, will enable FAIRer (Wilkinson et al., 2016) geochemical data, which will in turn make interdisciplinary use easier. 

Geochemical data has direct application to six of the seventeen UN Sustainable Development Goals (SDG#6 (Clean Water and Sanitation); SDG#7 (Affordable and Clean Energy); SDG#8 (Decent Work and Economic Growth); SDG#9 (Industry, Innovation and Infrastructure); SDG#13 (Climate Action); SDG#15 (Life on Land)), and FAIR geochemical data will accelerate the generation of new geoscientific knowledge and discoveries. Within the greater framework of the WorldFAIR project, this deliverable has come together in collaboration with CODATA (WP01 and WP02) and the International Union of Pure and Applied Chemistry (IUPAC, WP03).

Geochemistry Scientific Content Component (Milestone)

Authors: Prent, Alexander; Wyborn, Lesley; Farrington, Rebecca; Lehnert, Kerstin; Klöcking, Marthe; Elger, Kirsten; Hezel, Dominik NFDI4Earth; ter Maat, Geertje; Profeta, Lucia

WorldFAIR Milestone 6, reported here, specifies work done and being undertaken for Deliverable 5.2 (due month 20), ‘Geochemistry Methodology and Outreach’, which has the following description: “This deliverable will outline the methodology used to develop and update FIPs and promulgate knowledge of them, including publishers to ensure the quality, interoperability and reusability of data in publications”. 

As geochemical data is collected on a diversity of natural and synthetic samples (rocks, sediments, minerals, fossils, meteorites, cosmic dust, fluids, gases, etc), from the Earth or other planetary bodies, there is an incredible range of analytical instruments used and hundreds of analytical techniques applied. This results in a community with many subdisciplines that produce typically ‘long tail’ data – data that are highly specific and small in volume. The community and the data produced are heterogeneous and overlaps of common minimum variables are scarce. 

We conclude that developing a single FAIR Implementation Profile (FIP) for all geochemical data will not be possible; rather, there will need to be multiple linked FIPs for geochemistry subdisciplines and at multiple levels of granularity. As a FIP is underpinned by FAIR Enabling Resources (FERs), many such FERs need to be publicly available or need to be published. By specifying the FER(s) that accompany each FAIR principle within the individual FIP, users of any geochemical dataset or database will have accurate documentation for each FAIR principle, which in turn enhances machine readability.
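The relationship between a FIP, the FAIR principles and the FERs declared against them can be pictured as a simple data structure. The sketch below is only an illustration under stated assumptions: the class names `FER` and `FIP`, the method names, and the placeholder resources are hypothetical and are not taken from the WorldFAIR deliverables or from any FIP tooling. Only the fifteen principle identifiers follow the published FAIR principles.

```python
from dataclasses import dataclass, field

# The fifteen FAIR principles, by their published identifiers.
FAIR_PRINCIPLES = (
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
)


@dataclass(frozen=True)
class FER:
    """A FAIR Enabling Resource (hypothetical minimal representation)."""
    name: str
    kind: str  # e.g. "identifier service", "vocabulary", "metadata schema"


@dataclass
class FIP:
    """A FAIR Implementation Profile: FERs declared per FAIR principle."""
    community: str
    declarations: dict = field(default_factory=dict)

    def declare(self, principle: str, fer: FER) -> None:
        """Record that this community uses `fer` to implement `principle`."""
        if principle not in FAIR_PRINCIPLES:
            raise ValueError(f"unknown FAIR principle: {principle}")
        self.declarations.setdefault(principle, []).append(fer)

    def undeclared(self) -> list:
        """Principles for which no FER has been declared yet."""
        return [p for p in FAIR_PRINCIPLES if not self.declarations.get(p)]


# Placeholder community and resources, for illustration only.
fip = FIP(community="example geochemistry subdiscipline")
fip.declare("F1", FER("placeholder identifier service", "identifier service"))
fip.declare("I2", FER("placeholder method vocabulary", "vocabulary"))

print(fip.undeclared())
```

Listing the undeclared principles is one way such a structure could show where FERs still need to be published. Linked FIPs for several subdisciplines would simply be several `FIP` instances that share some `FER` objects.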

This Milestone describes progress towards developing a methodology designed to assist in defining the individual FERs required to fully describe the minimum scientific and technical variables used to describe any geochemical analysis. These FERs will enable the generation of multiple FIPs, facilitating published results to be reproduced and shared globally with sufficient metadata to make any geochemical resource FAIR for both humans and machines. 

This Milestone report then discusses how the components of this methodology are being executed in the community, discusses resulting progress towards minimum common variables of samples, discusses how to make best practices for geochemical methods available online and specifies a set of vocabularies published to describe methodologies.

Social Surveys (WP6)

Cross-national social sciences survey FAIR implementation case studies

Authors: McEachern, Steven; Orten, Hilde; Thome Petersen, Hanna; Perry, Ryan

This report provides an overview of the data harmonisation practices of comparative (cross-national) social surveys, through case studies of: (1) the European Social Survey (ESS) and (2) a satellite study, the Australian Social Survey International – European Social Survey (AUSSI-ESS). To do this, we compare and contrast the practices between the Australian Data Archive and, the organisations responsible for the data management of ESS and AUSSI-ESS. 

The case studies consider the current data management and harmonisation practices of study partners in the ESS, including an analysis of the current practices with FAIR data standards, particularly leveraging FAIR Implementation Profiles (FIPs) and FAIR Enabling Resources (FERs).

The comparative analysis of the two case studies considers key similarities and differences in the management of the two data collections. Core differences in the use of standards and accessible, persistent registry services are highlighted, as these impact on the potential for shared, integrated reuse of services and content between the two partner organisations.

The report concludes with a set of recommended practices for improved management and automation of ESS data going forward—setting the stage for Phase 2 of WorldFAIR Work Package 6—and outlines the proposed means for implementing this management in the two partner organisations. These recommendations focus on three areas of shared interest:

Aligning standards

Establishing common tools

Establishing and using registries

in order to advance implementation of the FAIR principles, and to improve interoperability and reusability of digital data in social sciences research. 

Population Health (WP7)

Population Health Data Implementation Guide

Authors: Gregory, Arofan; Todd, Jim; Amadi, David; Greenfield, Jay; Muyingo, Sylvia; Tomlin, Keith

One of the key requirements for FAIR data reuse is that the user of a FAIR data resource understands the exact nature of the data. The FAIR principles talk about the kinds of metadata needed to describe data, but it is necessary for implementers to understand how these metadata can be provided, to effectively realise FAIR within their systems. This implementation guide describes the way all aspects of the data are made available for use, both within and from outside the INSPIRE Network community, using standard metadata to describe the data. This is an exploration of how generic standards can be used to express the agreed community metadata set. The INSPIRE platform supports network studies using population health data to stand up their own instances of a common data model called the OMOP CDM. The WorldFAIR project is an exploration to facilitate a better understanding of what is needed for data infrastructures to provide data in line with the FAIR principles within and across domains.

The types of metadata used in INSPIRE are aligned as much as possible with existing and popular models common in the public health domain. Primary among these are the standards (and tools) coming from OHDSI (Observational Health Data Sciences and Informatics), notably their OMOP Common Data Model (CDM). This suite of products addresses the definition of specific concepts and their semantics, standard (primarily medical) classifications, and the mechanism for selecting data from among those available to produce a specific cohort for analysis. These standards are common within the public health domain internationally, and INSPIRE has chosen to use them to reduce the significant cost of developing tools for many aspects of data and metadata management and use.

FAIR demands that we provide data in a useful way to those who may not be familiar with the community tools and standards used by INSPIRE. More generic standards are thus needed to support this broader community. It is significant that members of the OHDSI community have already looked at how Schema.org – developed and supported by many popular search engines, Google foremost among them – can be used in combination with the OHDSI OMOP CDM to describe data resources. Here, INSPIRE builds on that work to describe how INSPIRE data resources, specifically, can be documented in a way which will be maximally accessible to users both within the community and external to it.
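To make the idea of describing a data resource with a generic web vocabulary concrete, the sketch below builds a minimal Schema.org `Dataset` record and serialises it as JSON-LD. It is an illustration only: the choice of Schema.org properties, the helper name `dataset_jsonld`, the dataset name, the URL and the listed variables are placeholder assumptions and are not taken from the INSPIRE implementation guide.

```python
import json


def dataset_jsonld(name, description, url, variables):
    """Build a minimal Schema.org Dataset description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Each variable is listed as a PropertyValue, a common Schema.org
        # pattern for documenting what a dataset measures.
        "variableMeasured": [
            {"@type": "PropertyValue", "name": v} for v in variables
        ],
    }


# Placeholder values for illustration; not a real INSPIRE resource.
record = dataset_jsonld(
    name="Example population health cohort",
    description="Placeholder description of a study's OMOP CDM instance.",
    url="https://example.org/datasets/cohort-1",
    variables=["person_id", "year_of_birth"],
)

print(json.dumps(record, indent=2))
```

A record of this kind could be embedded in a dataset landing page so that generic search engines and harvesters can read it without knowing the community's own tools.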

One critical part of the overall information set provided by standard FAIR metadata is a description of the experiment for which the data was used, and the protocol employed in the selection and analysis of the data. This aspect of the metadata description is a major focus of the implementation guide, and one for which Schema.org would seem to be well-suited.

WorldFAIR WP (Work Package) 07 is one of eleven domain-specific case studies being undertaken by the WorldFAIR project, with the domain-specific practices being analysed across these domains in WP02. Early indications from WP02 suggest that Schema.org is one of the standards which will be recommended as part of the Cross-Domain Interoperability Framework (CDIF). This implementation guide contributes to an understanding of exactly how Schema.org fits into the description of domain data.

While some open questions remain, the implementation guide has achieved its primary goal of showing how standards such as Schema.org can be used within the public health domain to provide a complete set of the information needed for FAIR data use across and within domain boundaries.

Urban Health (WP8)

Urban Health Data – Guidelines And Recommendations (D8.1)

Author: Ortigoza, Ana

This report provides a summary of actions and findings of the Urban Health Mapping and Assessment (Task 8.1) for WorldFAIR Work Package 08. Firstly, we assessed the implementation of FAIR principles within the Urban Health field, through two case studies: 1) the elaboration of a web data platform for the SALURBAL project (Urban Health in Latin America); and 2) the elaboration of a FAIR Implementation Profile (FIP) for the Urban Health discipline in general.  Secondly, we focused on the data collection and harmonisation process of health survey data that was led by the SALURBAL team and allowed the elaboration of consensus on terminologies and procedures that facilitates the use of survey health data in cities for research and action.

The FAIR Implementation Profile for the SALURBAL case study contributed to the renovation process that the data system was undergoing, offering valuable guidance on good practices currently possible for making data FAIR. The elaboration and documentation of standard procedures used in data and metadata identification for the SALURBAL web platform not only contributed to their findability (‘F’) but also the access and reuse of the data (‘A’, ‘R’).   For the Urban Health discipline, the FAIR Implementation Profile shed light on the lack of a common repository for urban health data, and showed that most urban health data should be encoded following DDI standards.  It also showed the inconsistent process of identifying data and metadata used in urban health. Lessons learned during the process support recommendations towards 1. the promotion of a deeper understanding of the FAIR principles among urban health researchers and practitioners; and 2. the systematisation of a data management plan during the design and initial steps of a research project that can guide the implementation of FAIR principles across different domains and working groups. The results of the SALURBAL (WorldFAIR WP08) FIP were used to create a web resource (A FAIR Primer) which contextualises the activity, provides information about the FAIR principles, relays some of the FIP’s findings and provides guidance for making SALURBAL data more FAIR.  This resource is included here as Appendix A.

Regarding health survey data, we found challenges in the harmonisation process that may hinder the use and interoperability of data between countries, and within countries across time, such as 1. disagreement in the definition of risk factors; 2. lack of consistency in categories or measurement units used for an indicator; 3. discrepancy in scales and questionnaires used for retrieving information about similar health behaviours or health outcomes. We leveraged the SALURBAL experience during this harmonisation process to propose a guideline for future harmonisation in health survey data in which the main recommendations are focused on the need to 1. generate consensus on definition and measurements in health data; and 2. revisit questionnaires and scales commonly used for some health behaviours and establish commitment on common uses.

The development of these deliverables for Task 8.1 made visible the gaps and needs of FAIR implementation in the urban health research community. Consequently, we will design and develop dissemination and training materials that can support and guide researchers and practitioners.

Biodiversity (WP9)

Data Standard For Sharing Ecological And Environmental Monitoring Data Documented For Community Review (D9.1)

Authors: Miller, Joe; Robertson, Tim; Wieczorek, John

Deliverable 9.1 for the WorldFAIR Project’s Biodiversity Work Package (WP09). Biodiversity standards are essential for FAIR data, in particular for interoperability.  Current standards need to be improved with new data models to better reflect the complexity of biodiversity and serve the information needed to address biodiversity loss and climate change. 

This Deliverable D9.1, focused on Task 9.1, describes the FAIR data model being developed in WorldFAIR WP09 with the Global Biodiversity Information Facility (GBIF) leading a community collaboration.  

Facilitated by the WorldFAIR project, GBIF’s engagement with the biodiversity community has led to this Deliverable – a new draft core Unified Model. The model was developed in collaboration with the Biodiversity Information Standards Group (TDWG) and through community consultation via webinars, open drafting of documents and solicitation of test datasets from the various stakeholders. A review of comparable standards has led to the development of a draft core model framework that is known to the community, which should make adoption easier.

The new model is centred around the ‘Event’ – something happened at some place during some period of time, optimally described by a protocol. This conclusion is based on research which describes how successful models are expressed and the flexibility of the Event to accommodate many types of data. The Unified Model is applicable to all currently-used data types and potential new data to be shared with GBIF. The current community engagement approach (more on which in D9.2) is to test individual components of the model with engagement activities and example datasets. This will continue with new tests expanding the potential utility of the model.
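The Event-centred idea described above (something happened at some place during some period of time, optionally described by a protocol) can be sketched as a small data class. This is a hedged illustration only and does not reproduce the Unified Model itself: the class name `Event`, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Event:
    """Something that happened at some place during some period of time."""
    event_id: str
    location: str
    start: date
    end: date
    protocol: Optional[str] = None  # how the event was carried out, if known
    records: list = field(default_factory=list)  # data attached to the event

    def __post_init__(self):
        # An event cannot end before it starts.
        if self.end < self.start:
            raise ValueError("event must not end before it starts")


# Placeholder survey event with one attached observation record.
survey = Event(
    event_id="evt-1",
    location="placeholder monitoring site",
    start=date(2023, 5, 1),
    end=date(2023, 5, 3),
    protocol="placeholder transect protocol",
)
survey.records.append({"taxon": "placeholder taxon", "count": 4})

print(survey.event_id, len(survey.records))
```

Because different kinds of data (occurrences, measurements, samples) can all be attached to an event, a structure like this suggests why the Event offers the flexibility the report describes.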

The WP tasks performed to date (collation of previous material, engagement with TDWG and Darwin Core (DwC) standard leads, webinars, use of shared documents to build use cases, building of exemplar datasets for collection management systems, and provision of several avenues for feedback) have resulted in the new provisional model currently under consultation. Testing and community engagement to date indicate that the model will better reflect the complexity of biodiversity data, leading to more efficient use of our community’s data in research and policy. The feedback we have received also indicates a steep learning curve for the future implementation of the data model. This feedback is essential for the development of publishing tools.

This work aligns with the overall objectives of WorldFAIR by focused development on improving the interoperability of biodiversity data. Our FAIR Implementation Profile (FIP) will be enhanced by this improved functionality. This work promotes cross-domain interaction, as the Unified Model will enhance sharing of data in related Work Packages such as Agricultural Biodiversity, Oceans, and Geochemistry in the final portion of the WorldFAIR grant period. This work has been undertaken in alignment with the overall WorldFAIR goals, in particular WP02 on Engagement, Synthesis, Recommendations and FAIR Assessment. 

Agricultural Biodiversity (WP10)

Agriculture-related pollinator data standards use cases report (Deliverable 10.1)

Authors: Trekels, Maarten; Pignatari Drucker, Debora; Salim, José Augusto; Ollerton, Jeff; Poelen, Jorrit; Miranda Soares, Filipi; Rünzel, Max; Kasina, Muo; Groom, Quentin; Devoto, Mariano

Although pollination is an essential ecosystem service that sustains life on Earth, data on this vital process is largely scattered or unavailable, limiting our understanding of the current state of pollinators and hindering effective actions for their conservation and sustainable management. In addition to the well-known challenges of biodiversity data management, such as taxonomic accuracy, the recording of biotic interactions like pollination presents further difficulties in proper representation and sharing. Currently, the widely-used standard for representing biodiversity data, Darwin Core, lacks properties that allow for adequately handling biotic interaction data, and there is a need for FAIR vocabularies for properly representing plant-pollinator interactions. Given the importance of mobilising plant-pollinator interaction data also for food production and security, the Research Data Alliance Improving Global Agricultural Data Community of Practice has brought together partners from representative groups to address the challenges of advancing interoperability and mobilising plant-pollinator data for reuse. This report presents an overview of projects, good practices, tools, and examples for creating, managing and sharing data related to plant-pollinator interactions, along with a work plan for conducting pilots in the next phase of the project.

We present the main existing data indexing systems and aggregators for plant-pollinator interaction data, as well as citizen science and community-based sourcing initiatives. We also describe current challenges for taxonomic knowledge and present two data models and one semantic tool that will be explored in the next phase. In preparation for the next phase, which will provide best practices and FAIR-aligned guidelines for documenting and sharing plant-pollinator interactions based on pilot efforts with data, this Case Study comprehensively examined the methods and platforms used to create and share such data. By understanding the nature of data from various sources and authors, the alignment of the retrieved datasets with the FAIR principles was also taken into consideration. We discovered that a large amount of data on plant-pollinator interaction is made available as supplementary files of research articles in a diversity of formats and that there are opportunities for improving current practices for data mobilisation in this domain. The diversity of approaches and the absence of appropriate data vocabularies causes confusion, information loss, and the need for complex data interpretation and transformation. Our explorations and analyses provided valuable insights for structuring the next phase of the project, including the selection of the pilot use cases and the development of a ‘FAIR best practices’ guide for sharing plant-pollinator interaction data. This work primarily focuses on enhancing the interoperability of data on plant-pollinator interactions, envisioning its connection with the effort WorldFAIR is undertaking to develop a Cross-Domain Interoperability Framework.

Ocean Science & Sustainable Development (WP11)

An assessment of the ocean data priority areas for development and implementation roadmap

After an introduction to the Intergovernmental Oceanographic Commission of UNESCO’s Ocean Data and Information System (ODIS), the report summarises an evaluation of FAIR Implementation Profiles and FAIR Enabling Resources compiled from across WorldFAIR case studies. It then synthesises supplementary insights obtained through a survey distributed across project partners, and identifies a pathway to implement sustainable cross-domain (meta)data flows to inform and support the current development of the Cross-Domain Interoperability Framework (CDIF), a major output of WorldFAIR.

The WorldFAIR case studies on biodiversity, disaster risk reduction, chemistry, and cultural heritage were identified as focal points to bridge with ODIS, being complementary to the strategic priorities of marine science and sustainable ocean management and offering clear socio-technical interfaces compatible with ODIS’s own interoperability approaches. The high-level roadmap in this report outlines the general approach that will be pursued in the remaining tasks in the ocean science case study.  

In summary, this report draws on insight into current data practices from the international, multi-domain WorldFAIR consortium to identify the most viable routes to establish and sustain cross-domain data interoperability.

Disaster Risk Reduction (WP12)

Disaster Risk Reduction Case Study Report

Bolland, Jill; Fakhruddin, Bapon; Reinen-Hamill, Richard

This report describes the types of data used for disaster risk reduction (DRR) and provides two country case studies, for Fiji and Sudan, with an in-depth look at the DRR datasets and associated metadata used by each country. These datasets were assessed against 15 FAIR (Findable, Accessible, Interoperable, Reusable) data metrics to identify which elements of FAIR were met. The report also provides a broader context giving details on the national, regional, and global agencies providing or hosting DRR data as well as initiatives aiming to increase the FAIRness of DRR data. 

Both of our case study countries are using remote sensing data, which were assessed as having the richest metadata and as meeting most of the FAIR metrics used in the assessment. Strategies for exploiting these data are discussed, as the data have great potential to provide up-to-date information during an emergency and to fill gaps in DRR data.

An essential task for any scientific discipline is the establishment of common standards and terminologies. Historically, standards have differed considerably, with agencies creating standards and vocabularies based on their own use cases and priorities; consequently, there is currently no universal standard used by all DRR practitioners. We discuss the most widely used standard definitions and provide suggestions for harmonising standards. As both the United Nations Office for Disaster Risk Reduction (UNDRR) and the World Meteorological Organisation (WMO) have been working toward improving the FAIRness and consistency of DRR data, we describe their efforts and outline their lessons learned and recommendations. Our next deliverable, which discusses metadata standards, controlled vocabularies, and ontologies, will add to this discussion.

While the current report focuses entirely on the DRR research area, DRR research is interdisciplinary by nature, encompassing researchers from earth sciences, climate change and environmental sciences, social studies, cultural information, and others. A key recommendation from the UNDRR is that there should be interdisciplinary collaboration when setting standards and definitions; therefore, increasing FAIRness in DRR has the potential to increase FAIRness across many related disciplines. 

The study found that the data used by Fiji and Sudan for DRR is missing many key FAIR data elements. Hazard data tended to score highest for FAIRness, particularly hazard data originating from satellites. In contrast, vulnerability and exposure data were the least FAIR with little metadata and limited machine readability. However, there are some excellent regional and global initiatives aimed at increasing the level of FAIRness in DRR data. The UNDRR is currently reinventing its DRR database to provide a much more coherent and consistent view of the state of DRR both globally and nationally. We applaud this project and believe that significant effort should be made by the global and regional agencies to work together to provide standards, controlled vocabularies, data distribution platforms, resources and guidance for all people working to reduce the impact of disasters.

Disaster Risk Reduction Domain-specific FAIR vocabularies

Bolland, Jill; Shanker, Neeraj; Reinen-Hamill, Richard; Fakhruddin, Bapon

Disasters are inherently complex with wide-ranging and cascading impacts. The exponential growth in data generated daily, coupled with the complex nature of disasters, means we are hitting the limits of humans’ capacity to fully exploit all the data available for disaster risk reduction (DRR). This can be addressed with well-designed, pretrained Artificial Intelligence (AI) algorithms that can analyse large, complex datasets and fuse heterogeneous data. However, machine-readable, semantically linked data is a precursor for the use of AI in DRR. 

Nations possessing ample resources and technical proficiency are better positioned to leverage DRR data effectively, thereby potentially creating disparities in the accessibility and application of DRR data. Recent advances in technology – particularly remote sensing data, which is income-agnostic and provides global coverage – provide an opportunity to reduce DRR data gaps. Global DRR institutions should collaborate proactively with countries and regional institutions to ensure the provision of Findable, Accessible, Interoperable, and Reusable (FAIR) and open DRR data. This could help bridge any historical or emergent DRR data inequalities.

This deliverable explores the use of vocabularies in the DRR domain and how controlled vocabularies coupled with ontologies can enhance the semantic value of DRR data, thereby improving interoperability. Enhancing semantic interoperability would result in improved collaboration and communication within the DRR domain and facilitate collaborations with other scientific domains. The final sections of this report provide examples of the use of remote sensing data and AI for DRR. We hope that the ideas and suggested actions in this report can be used to transform raw DRR data into valuable insights and decisions that produce tangible reductions in the impact of disasters worldwide.

Cultural Heritage (WP13)

Cultural Heritage Mapping Report: Practices and Policies supporting Cultural Heritage image sharing platforms

Knazook, Beth; Murphy, Joan

Deliverable 13.1 Cultural Heritage Mapping Report: Practices and Policies supporting Cultural Heritage image sharing platforms outlines current practices guiding online digital image sharing by institutions charged with providing care and access to cultural memory, in order to identify how these practices may be adapted to promote and support the FAIR principles for data sharing. It looks closely at the policies and best practices endorsed by a range of professional bodies and institutions representative of Galleries, Libraries, Archives and Museums (the ‘GLAMs’) which facilitate the acquisition and delivery, discovery, description, digitisation standards and preservation of digital image collections. The second half of the report further highlights the technical mechanisms for aggregating and exchanging images that have already produced a high degree of image interoperability in the sector with a survey of six national and international image sharing platforms: DigitalNZ, Digital Public Library of America (DPLA), Europeana, Wikimedia Commons, Internet Archive and Flickr. This report will be a valuable resource in producing recommendations for aligning existing professional practice in the sector with the FAIR principles – a key milestone for the case study.

The report concludes with some thoughts on the position of the Digital Repository of Ireland (DRI) as an image sharing platform within this landscape, as a stewarding repository for both cultural heritage organisations in Ireland seeking to preserve and make accessible their collections as well as research projects curating, examining, preparing and delivering cultural heritage data for reuse. At the end of the WorldFAIR project, the DRI will aim to have tested and implemented recommendations that align established collections delivery mechanisms to facilitate the use of cultural heritage images as research data, improving the findability, accessibility, interoperability and reusability of Ireland’s visual cultural memory.

Cultural Heritage image sharing recommendations report

Knazook, Beth; Murphy, Joan

Deliverable 13.2 for the WorldFAIR Project’s Cultural Heritage Work Package (WP13). Although the cultural heritage sector has only recently begun to think of traditional gallery, library, archival and museum (‘GLAM’) collections as data, long-established practices guiding the management and sharing of information resources have aligned the domain well with the FAIR principles for research data, as evidenced in complementary workflows and standards that support discovery, access, reuse, and persistence. As explored in the previous report by Work Package 13 for the WorldFAIR Project, D13.1 Practices and policies supporting cultural heritage image sharing platforms, memory institutions are in an important position to influence cross-domain data sharing practices and raise critical questions about why and how those practices are implemented.

Deliverable 13.2 aims to build on our understanding of what it means to support FAIR in the sharing of image data derived from GLAM collections. This report looks at previous efforts by the sector towards FAIR alignment and presents 5 recommendations designed to be implemented and tested at the DRI that are also broadly applicable to the work of the GLAMs. The recommendations are ultimately a roadmap for the Digital Repository of Ireland (DRI) to follow in improving repository services, as well as a call for continued dialogue around ‘what is FAIR?’ within the cultural heritage research data landscape.