The International Data Week (IDW) 2023 took place on 23–26 October 2023, in Salzburg, Austria. IDW 2023 was hosted by the University of Salzburg through its interdisciplinary Data Science group and the Geoinformatics department, supported by the Governor of Salzburg and with assistance from the Austrian Academy of Sciences – GIScience and the European Umbrella Organization for Geographic Information.
The event brought together a global community of data scientists and data stewards; researchers from all domains; data, interoperability and informatics experts from all fields; and industry leaders, entrepreneurs and policymakers.
IDW 2023 combined the RDA 21st Plenary Meeting, the biannual meeting of this international member organisation working to develop and support global infrastructure facilitating data sharing and reuse, and SciDataCon 2023, the scientific conference addressing the frontiers of data in research organised by CODATA and WDS.
The WorldFAIR Project: A thread running through the IDW2023 programme
The WorldFAIR coordination and case study teams participated in over 10 sessions throughout the week, as we previously outlined in this blog. From organising sessions dedicated to the work underway in the project, to participating in and presenting on behalf of the project at various RDA and SciDataCon sessions, the WorldFAIR project had a vibrant and distinct presence throughout the IDW2023 week.
The project was also present with an exhibition stand and a poster, leveraging the networking opportunities an international event like IDW has to offer.
The WorldFAIR Project at the IDW2023 poster (left) and exhibition (right) areas.
This report provides a summary and highlights of the project’s work at IDW2023.
The WorldFAIR Project: the methodology, the case studies and the Cross-Domain Interoperability Framework
1. FAIR Beyond Discoverability: Exploring Technology Approaches and Challenges through DDI-CDI Implementation
Summary by Simon Hodson
This session presented the Data Documentation Initiative’s new Cross-Domain Integration standard. DDI-CDI is designed to assist with combining data across domains with different sources and data structures. To do this it comprises three modules: data description (through the variable cascade), structural description, and provenance and processing. DDI-CDI plays an important role in the WorldFAIR project and the Cross-Domain Interoperability Framework, and is designed to work with many other standards.
The session featured specific use cases and implementations: one, from EOSC Future Science Project 9, that combines social survey data with climate and air quality data; and another from UKDA where granular data descriptions from CDI are important in helping manage fine-grained access to potentially disclosive data. The discussion revolved around clarifying the potential uses of DDI-CDI and identifying a number of possible partnerships for further testing and implementation.
2. WorldFAIR: the Cross Domain Interoperability Framework (CDIF)
Summary by Simon Hodson
What is the Cross-Domain Interoperability Framework? In this session we found out what it is and what it is not! It is not a new metadata standard; it is not the 15th standard in the famous XKCD cartoon; and it is certainly not one overarching mega standard, one ring to rule them all! It is a set of recommendations for how to meet a set of functional requirements to implement FAIR, paying particular attention to machine actionability. It is a framework of existing and emerging standards that can provide cross-domain solutions.
The draft CDIF will be an output of the WorldFAIR project in May 2024. It builds on the work of the WorldFAIR Case Studies and their FAIR Implementation Profiles; it is being developed by the CDIF Working Group and has benefited from a sprint at a recent Dagstuhl Workshop. This session presented CDIF in the context of WorldFAIR and the eleven case studies; outlined the functional areas being addressed and the emerging recommendations; and showcased two strands of work from the recent workshop: one on data access and one on data integration and semantic mappings. The latter presentation, from Yann Le Franc, showed how semantic mappings and the DDI-CDI standard can be used together for data integration! This highlights interconnections and collaborations between activities in CODATA, DDI, and RDA, and across the European projects FAIR Impact and WorldFAIR.
Questions covered a range of topics, including how to use ODRL in the context of queries and how to deal with uncertainty and error when combining datasets. At least half the audience indicated they were working on cross-domain projects. There is no shortage of potential Case Studies and partners as CODATA seeks to expand and sustain the WorldFAIR activities!
The WorldFAIR Project addressing global challenges
1. IDW 2023 Plenary Session ‘Data and global challenges: data, science, trust and policy’
The major global human, societal, and scientific challenges of our age are fundamentally interdisciplinary and related to all sectors of society. These challenges can only be addressed through the close collaboration of science, civil society, and government using cross-domain and multi-stakeholder research that seeks to understand complex systems, including through machine-assisted analysis at scale.
In this session, Pier Luigi Buttigieg, WorldFAIR case study lead for Ocean Science and Sustainable Development, presented on ‘The UN Ocean Decade & the Ocean Data and Information System’, touching on his work carried out under the WorldFAIR project.
Pier Luigi spoke about building a digital federation for the UN Ocean Decade and its core principles, ‘embracing pluralism within a framework of multilateralism’. He stressed the importance of the ‘plumbing’, the ‘triumph of civilization’ that holds our cities together, which few people are willing to work on (as opposed to wanting ‘the sparkly fountain’, i.e., the fancy app or interface). Pier Luigi noted that focusing on the ‘plumbing’ is the key principle of his work. He also discussed the need for interoperability agreements in order to build and run sustainable digital architectures. He touched on the Ocean Decade Implementation Plan, intended to address the fact that the ocean is underreported in how nations connect science to the SDGs, owing to the lack of a global system for observing and understanding the ocean and for mediating collective action. In summary, Pier Luigi noted that we need to move into a mode of sustainable ocean management; this requires a number of elements to come together (science operations, governance, policy and the public), and the framework is intended to guide this process.
Pier Luigi discussed the Ocean InfoHub (OIH) project, which aims to improve equitable access to global ocean information, metadata and knowledge products for science and sustainable development, and the Ocean Data and Information System (ODIS), a sustainable federation of independent partners that will continue beyond the project. This work is positioned among other stakeholders, the WorldFAIR project among them, to ensure interoperability. The ocean environment is multi-domain and multidisciplinary. Within WorldFAIR and the Cross-Domain Interoperability Framework (CDIF), the specifications are being moved out of ODIS and into the CDIF, with the aim of bringing more domains together. The WorldFAIR case study has identified a set of priorities for building cross-domain interoperability, namely biodiversity, hazards and disasters, chemicals and pollutants, and cultural heritage and knowledge. The aim is to move towards an integration-on-demand data space.
2. Inclusivity in Open Science while advancing research assessment and career pathway impact
Summary by IDW2023 Communications Committee
This plenary session addressed the wide-ranging topic of “Inclusivity in Open Science while advancing research assessment and career pathway impact”. The session featured four speakers – Abel L. Packer, Maui Hudson, Ginny Barbour and Ana Ortigoza (WorldFAIR WP Lead for Urban Health). The speakers discussed the international aim of supporting more open and transparent research practices to achieve more trustworthy and inclusive research and science, and most importantly to be able to have more positive impacts on society.
Ana’s talk focused on ‘Data for Equity: the case of Latin America’, and addressed important questions: what are the needs of minority and indigenous groups, what needs to be addressed, and who can and should support this change?
Ana started her presentation by noting that ethnic identification in countries’ census questionnaires has grown over the years, but there are challenges in interpreting and comparing these results, as data collection has to be consistent across countries. Additionally, gender self-identification is legally recognized in only 4 countries (i.e., not inclusive). So what can be done? Ana stressed that data should be available and comprehensible. As to who should support this initiative, the WP8 lead noted that this is a multidisciplinary topic; evidence and research can be advanced by generating and having meaningful data on ethnicity and gender.
The key messages from Ana Ortigoza’s presentation are:
- Health data disaggregated by ethnic and gender diversity dimensions are key for advancing health equity in the Americas,
- Challenges for this goal arise throughout the data cycle and should be approached in a comprehensive way,
- This is a window of opportunity for FAIR and CARE data implementation, supported by transdisciplinary work across regions.
The WorldFAIR work package on Urban Health is working on recommendations that reflect the FAIR and CARE principles, and on overall promoting best practices in data sharing in Urban Health and beyond.
All speakers’ presentations, including Ana’s, can be found here.
RDA Plenary, SciDataCon and the WorldFAIR Project come together: (Cross)Disciplinary perspectives
1. Why aren’t we talking about Collections as Data?
Summary by Beth Knazook (WP13, Cultural Heritage)
This session introduced participants to the emerging discourse around Collections as Data through a series of presentations from professionals engaged in promoting FAIR and computational reuse of (primarily, but not limited to) cultural heritage collections in libraries, archives, and museums, with the goal of creating an RDA Interest Group to sustain the conversation. Co-chairs Beth Knazook and Thomas Padilla presented the European-funded WorldFAIR Project Case Study on Cultural Heritage Image Sharing, followed by the two originating Mellon-funded projects, “Always Already Computational” and “Collections as Data: Part to Whole.” Four additional presentations showcased the range of emerging activity in the field, from institutional and infrastructure developments that support computational reuse of collections to engagement activities promoting and demonstrating actual reuse of collections as data. Presenters included: Sally Chambers, DARIAH-EU; Sharif Islam, DiSSCo; Bethany Anderson, University of Illinois, Urbana-Champaign; and Riitta Peltonen, National Library of Finland.
Key takeaways, lessons and actions:
- There is considerable interest in the topic and enough enthusiasm to launch an Interest Group. Co-chairs will convene a meeting with participants following the Plenary to discuss key actions for the group.
- Digital preservation needs to be a bigger part of the conversation and there may be opportunities for an RDA group to liaise with existing preservation groups/communities.
- We need to move forward with an inclusive mandate for ‘digital heritage’ FAIR data instead of ‘cultural’ vs ‘natural’ heritage. Collections cover a wide range of disciplinary and cultural knowledges.
2. Describing Chemical, Physical and Biological samples digitally: Seeking harmonization
This session, co-organized by multiple work packages (WP02, WP03, WP04, WP05, WP10), in addition to PaN and CRDIG, brought together a variety of disciplines including geochemistry, biodiversity, nanomaterials, analytical chemistry, and crystallography, among others, to explore approaches to harmonization around sample description and provenance. The first half of the session focused on sharing perspectives and identifying what the wider community’s needs are, what may already exist to help address them, and what further action is required. The meeting featured a number of WorldFAIR presentations: Kerstin Lehnert (IGSN), Debora Pignatari Drucker (WP10/IGAD CoP), Alexander Prent (WP05), Iseult Lynch (WP04), Leah McEwen (WP03/CRDIG), and Arofan Gregory (WP02).
The discussion covered the needs for describing sample types, origin, processing workflows and other requirements across disciplines; the existing identifiers, classifications, ontologies and terminologies that support these descriptions; and, ultimately, whether there is interest in an RDA Working Group to develop best practices for sample data model specifications. On the latter, it was agreed that this is an important interdisciplinary topic, with the potential for an RDA WG on samples to work on the following issues:
- What are the axes of metadata around samples?
- What is already being used across the communities?
The session organisers have agreed to propose a general way forward to deal with sample data/metadata/provenance.
3. Let’s talk about FAIR mappings! Towards common practices for sharing mappings and crosswalks
This RDA P21 BoF session explored the growing number of mappings that are continuously developed in various groups, both within and outwith the RDA Community.
To reach the full potential of these mappings for interoperability, it is necessary to share them in a machine-actionable format and, in essence, to make them FAIR. The session invited various RDA Interest and Working Groups working with mappings to give short presentations explaining their approach and usage, to feed the discussion and to explore next steps for the community.
Iseult Lynch from WorldFAIR gave the Nanomaterials perspective with her presentation on ‘Describing and representing (nano)materials: mapping shape representations’.
4. Beyond FAIR: Reusing Chemical Data Across-disciplines with CARE, TRUST, and Openness.
This session was co-organized by WP03, WP04 and WP05, in addition to PARC. The session aimed to look beyond FAIR data, moving from reusable to reused. Through real-world research projects, the meeting presented how FAIR data can be integrated across disciplines to enable innovative analysis. What opportunities were realised by having access to FAIR data? What challenges continue to arise in working with heterogeneous data? How usable are chemistry data in different disciplinary contexts?
The presenters explored use cases of interoperability around chemistry data in the broader environment, and the range of issues that may arise in applying chemical notation to nanomaterials, therapeutics, geochemistry, atmospheric research and beyond. They concentrated on how data concepts such as FAIR, CARE, TRUST and openness, and the synergies among them, can maximize the applicability of data to timely real-world problems.
5. Open data and open service for disaster risk reduction
This session brought together leading experts to review the progress in resilient data for DRR in the context of open science and to share their experience, insights and suggestions to advance this new, cross-sectoral endeavour. The speaker line-up included Bapon Fakhruddin from the WorldFAIR case study team on Disaster Risk Reduction, who presented on ‘A regional showcase in support of GOSC SDG-13 CS’.