Zenodo launches integration with Software Heritage

by Lars Holm Nielsen, on October 21, 2024


Zenodo and Software Heritage, through the EU-funded FAIRCORE4EOSC project, have launched a new integration. In order to fulfill the promise of an interconnected and interoperable academic ecosystem, research software infrastructures should support the archiving of source code within the universal source code archive, contributing to the global software commons. This integration ensures that software source code deposited in Zenodo is automatically archived in Software Heritage. It implements the recommendations from the EOSC Scholarly Infrastructures for Research Software report:

“In the 21st century, many research activities use computing systems to monitor their experiments, to visualise or analyse their results, or to check hypotheses through simulation. It has therefore become essential to archive, preserve and share research software.”

“Over the past decade, awareness has been raised about the importance of software in the scholarly world. Several infrastructures have started to be built, or adapted, to address some of the following key challenges that need to be tackled to put software on equal footing with other research outputs in the scholarly world:

  • Archiving software to ensure research software artifacts are not lost.
  • Referencing software to ensure research artifacts can be precisely identified.
  • Describing software to easily discover and identify research software artifacts.
  • Crediting all authors to ensure their contributions are recognized.”

Zenodo: Research software + Versioning + GitHub

Zenodo has long had a strong focus on supporting research software. Since 2014, Zenodo has had an integration with GitHub that enables researchers to easily archive research software hosted on GitHub in Zenodo. Upon deposit of the research software in Zenodo (either from GitHub or directly in Zenodo), the researcher obtains a DOI (Digital Object Identifier), which facilitates the persistent identification of software and supports researchers in adopting the Software Citation Principles, in particular in citing research software papers. The Zenodo versioning feature further enabled the citation of both individual snapshots of software and a software project as a whole. Today, Zenodo is the largest minter of software DOIs and is able to track citations to software independently of which persistent identifier was used in the citation.

Integration with Software Heritage

The new integration between Zenodo and Software Heritage enhances the capabilities to archive, reference, describe, and cite research software artifacts. Most of the process occurs behind the scenes, ensuring seamless and transparent software archiving for researchers, regardless of their workflow.


Figure 1 - Zenodo record for a software deposit showing that it has been archived in Software Heritage in the bottom right corner.

When a researcher deposits software in Zenodo, the software will be automatically sent to Software Heritage (if the files are publicly accessible). Zenodo then obtains the associated Software Hash Identifier (SWHID), links it with the DOI, and displays it on the record landing page. The DOI integrates with the scholarly publishing ecosystem, while the SWHID provides direct access to the archived source code, including the full version history. This bi-directional linking ensures interoperability between two key identifiers for research software.
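
For readers unfamiliar with SWHIDs, here is a minimal sketch of what such an identifier looks like and why it is intrinsic to the code it names. It is not the code used by Zenodo or Software Heritage. It assumes the core SWHID syntax `swh:1:<type>:<40 hex digits>`, with the object types `cnt`, `dir`, `rev`, `rel` and `snp`, and it relies on the fact that a content (`cnt`) identifier uses the same SHA-1 as a Git blob. The function names `parse_swhid` and `content_swhid` are hypothetical, and this post does not state which object type the SWHID shown on a Zenodo record refers to.

```python
import hashlib
import re

# Core SWHID syntax: swh:1:<object type>:<40 lowercase hex digits>.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})$")


def parse_swhid(swhid):
    """Return (object_type, hex_hash) for a core SWHID, or raise ValueError."""
    match = SWHID_RE.match(swhid)
    if match is None:
        raise ValueError(f"not a core SWHID: {swhid!r}")
    return match.group(1), match.group(2)


def content_swhid(data):
    """Compute the SWHID of one file's bytes (same hash as a Git blob)."""
    # Git hashes a blob as: b"blob <length>\0" followed by the raw content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()
```

Because the identifier is computed from the bytes themselves, anyone holding a copy of a file can recompute its SWHID and check it against the archive, whereas a DOI is assigned by a registry and resolved through it. For example, `content_swhid(b"")` gives `swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the well-known hash of an empty Git blob.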


Figure 2 - The corresponding Zenodo software record in Software Heritage.

In addition to archiving software in Software Heritage, Zenodo has enhanced the upload form with software-specific fields, such as programming languages and repository URLs, alongside existing fields such as the SPDX license field. We’ve also added support for CodeMeta and Citation File Format export formats.

What’s next?

While the core integration with Software Heritage has launched, further backend improvements are planned for the coming six months, primarily aimed at improving interoperability. Additionally, this integration will be fully incorporated into InvenioRDM, making it easier for other repositories, such as institutional ones, to integrate with Software Heritage.

This work was funded by the European Commission through grant agreement no. 101057264 (FAIRCORE4EOSC).



Win the 2024 DataWorks! $100,000 Grand Prize by reusing Zenodo data

by Pearl D. Go, on October 14, 2024


Announcing the 2024 DataWorks! Prize!

The Federation of American Societies for Experimental Biology (FASEB) and the National Institutes of Health (NIH) invite you to submit your data reuse project proposal that demonstrates the power of data reuse to advance human health.

The 2024 DataWorks! Prize is a collaboration with the seven generalist repositories participating in the NIH-funded Generalist Repositories Ecosystem Initiative (GREI) and will focus on best practices in data reuse and secondary analysis that advance human health. Participants will take part in a two-phase challenge:

  • Phase 1: Research teams will submit a proposal for a secondary analysis research project that can be completed within a six-month period and incorporates data from one or more generalist repositories participating in the GREI (more information on GREI); data from other repositories can be combined.
  • Phase 2: Selected teams will complete their reuse/secondary analysis research projects and share their findings publicly.

Awards and Prizes

The NIH Office of Data Science Strategy will award up to $500,000 total in cash prizes to the Challenge winners. NIH will award the prize purse in the following amounts:

  • Phase 1: $25,000 per winner, up to ten (10) winners
  • Phase 2: Grand Prize: $100,000 for one (1) winner; Distinguished Achievement Awards: $75,000 per winner, up to two (2) winners

Successful submissions must:

  • Address a pivotal health research question via data reuse and secondary data analysis
  • Include data from at least one GREI repository, such as Zenodo
  • Share results with the broader community
  • Be submitted by October 24, 2024

Reusing data in Zenodo:

As a cross-domain repository, Zenodo enables researchers to share and preserve a wide range of interdisciplinary research outputs, including research papers, data sets, research software, reports, presentations, and any other research related digital outputs.

How can I find Zenodo data for reuse?

Search for data by simply typing keywords of interest in the search box at the top of the Zenodo home page. Use the menu on the left side of the page to facet search results by resource type (e.g., dataset), access status (e.g., open), subject area, and file type. Example searches include hypertension, pathogen, comparative genomics, and precision medicine. You can order the search results by date, best match, most viewed, etc. Use the Zenodo Search Guide to further craft your search.

You can also try your search in Zenodo Communities. A Zenodo community provides a space for domains, projects, and institutions to curate and manage a collection of their research outputs and share them with members of the community and beyond (for example, some Communities funded by NIH).
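
If you would rather script your keyword search than use the web page, the sketch below builds a query URL for Zenodo's public REST API. It is an illustration under assumptions, not an official client: the endpoint `https://zenodo.org/api/records` and the parameter names `q`, `type`, `sort` and `size` are taken from our reading of the public API documentation and should be checked there before use, and the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed public search endpoint; verify against the Zenodo REST API docs.
ZENODO_RECORDS_API = "https://zenodo.org/api/records"


def build_search_url(keywords, resource_type="dataset", sort="bestmatch", size=10):
    """Build a search URL for records matching the given keywords."""
    params = {
        "q": keywords,          # free-text query, as typed in the search box
        "type": resource_type,  # assumed filter name for the resource type facet
        "sort": sort,           # assumed sort key, e.g. best match
        "size": size,           # number of results per page
    }
    # urlencode escapes spaces and special characters in the values.
    return f"{ZENODO_RECORDS_API}?{urlencode(params)}"
```

Calling `build_search_url("precision medicine")` returns a URL you can open in a browser or fetch with any HTTP client. The sketch only constructs the URL, so it makes no network request itself.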

Full details about the challenge can be found here.

We encourage you and your team to submit your project proposals. Your ideas could shape the future of healthcare.



EU Open Research Repository

by Lars Holm Nielsen, on March 20, 2024


The European Commission and CERN are today launching a pilot of the new EU Open Research Repository that welcomes research outputs (data, software, posters, presentations, project deliverables) stemming from one of the EU’s research and innovation funding programmes such as Horizon Europe and Euratom.

Supporting EU Open Science policy

The EU has increasingly supported the implementation of the Open Science policy in successive Research and Innovation Framework Programmes, starting from the Open Access pilot in FP7, adding Open Data provisions in Horizon 2020, and laying down in Horizon Europe a set of provisions for open science practices such as open access to scientific publications, open access to research data and responsible management of research data, notably through the mainstreaming of data management plans and in line with the FAIR principles. This evolution has been accompanied by several EU-funded actions to support beneficiaries to better manage their research outputs and to facilitate the implementation of the programme provisions. This has notably included support to the creation of Zenodo, a general-purpose open repository operated by CERN, allowing researchers to deposit research papers, data sets, research software, reports, and any other research related digital outputs.

The new EU Open Research Repository, a Zenodo-community, capitalizes on past investments made in Zenodo and helps EU programme beneficiaries comply with the new FAIR and open science requirements, by implementing an easy go-to solution in Zenodo for beneficiaries to make data FAIR in practice. The repository is managed by CERN on behalf of the European Commission.

Pilot

Currently in its pilot phase and set to be fully operational during autumn 2024, the EU Open Research Repository is constantly evolving. Efforts are committed to integrating cutting-edge features, including assisted curation and FAIR (Findable, Accessible, Interoperable, and Reusable) assistance, to further support the research community. The goal is to provide researchers with a simple go-to solution for making their publicly funded research open and as FAIR as possible.

Supporting EU research projects

Today, more than 500 EU-funded research projects already have a Zenodo-community where they share their research outputs. The EU Open Research Repository will index all these and future project communities under a single umbrella and provide them with enhanced features to enable all project partners to easily share the project’s research outputs. Since its launch more than 10 years ago, Zenodo has supported linking research outputs with the EU grant that funded the work; as a result, Zenodo today hosts more than 100,000 research outputs from more than 11,000 different grants.

Several early adopter projects from different domains (BY-COVID, FAIR-IMPACT, FAIRplus, GDI, iMagine, interTwin, RESILIENCE, SERPENTINE) are collaborating to help provide feedback on the features developed for the new repository. Other projects interested in adopting the new features can sign up for free and will be onboarded as soon as possible throughout the pilot phase.

Open Research Europe (ORE)

The EU Open Research Repository serves as a complementary platform to the Open Research Europe (ORE) publishing platform. Open Research Europe focuses on providing a publishing venue for peer-reviewed articles, ensuring that research meets rigorous academic standards. The EU Open Research Repository provides a space for all the other research outputs including data sets, software, posters, and presentations that are out of scope for ORE. This holistic approach enables researchers to not only publish their findings but also share the underlying data and materials that support their work, fostering transparency and reproducibility in the scientific process.

Funding

The EU Open Research Repository is funded by the European Union under grant agreement no. 101122956 (HORIZON-ZEN).

You can learn more about the HORIZON-ZEN project at https://about.zenodo.org/projects/horizon-zen/