Zenodo’s next generation platform - InvenioRDM

by Lars Holm Nielsen, on December 7, 2022


We're excited to share our plans for Zenodo in the coming 6-12 months.

TL;DR - See our demo site for Zenodo's next-generation platform based on InvenioRDM.

Next-generation platform

We are currently in the process of moving Zenodo on top of our next-generation platform based on InvenioRDM. InvenioRDM is a digital repository platform born out of Zenodo and developed together with 25 other partners.

For institutions/domains, InvenioRDM enables them to provide the "Zenodo-experience" in their own repositories. For Zenodo, InvenioRDM enables us to collaborate with partners on the technical platform underpinning Zenodo.

InvenioRDM is already being used in production systems today, see for instance:

The ZenodoRDM project

The move to InvenioRDM is being managed and coordinated through our ZenodoRDM project. The focus of the project is to migrate Zenodo.org with the existing feature set to InvenioRDM and provide backward compatibility to our users.

The largest challenge in the project is by far the full data migration process. Zenodo today has almost 3 million records, 300k users and many API integrations, and thus we're doing our utmost to minimize disruptions during the data migration process and to ensure existing API integrations continue to work unaffected.

Milestones

Our project is designed to start early on with the data migration, and continue it as a main activity throughout the entire project. Thus, a large fraction of each milestone is dedicated to data migration.

In addition, for each milestone we'll work with specific users to validate that everything works as intended.

The milestones for the project

  • MS2 Themed demo-site with initial data migration
  • MS3 Legacy API support
  • MS4 Basic feature set completion
  • MS5 Normal feature set completion
  • MS6 Advanced feature set completion
  • MS7 Support operations
  • MS8 Launch

Timeline

We expect to go live with the new platform in autumn 2023.

What's new

The move to InvenioRDM will also provide some long-awaited new features for Zenodo. See below what's coming, or even better, check out the new features for yourself on our demo site.

State of demo site

We have loaded a partial dump of 1M records and a couple of communities from Zenodo into the demo site. The records are not connected with user accounts and thus cannot be edited. You can create new communities and test the new upload form and get a feeling for some of the changes we've made.

Community members

InvenioRDM supports community members with different roles - that means multiple users can curate records and/or see closed access content in the community. Zenodo will come with the following community roles:

  • Reader: A reader is a member of the community and can view restricted files owned by the community.
  • Curator: In addition to what a reader can do, a curator can edit, accept and decline records in a community.
  • Manager: In addition to what a curator can do, a manager can manage the members of a community.
  • Owner: An owner has full administrative access to a community, and can change all settings as well as delete the community.
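The roles above read as a ladder where each role builds on the one before it. A minimal sketch of that idea, assuming a strictly cumulative hierarchy; the action names and the `can` helper are hypothetical illustrations, not InvenioRDM's actual permission API:

```python
from enum import IntEnum


class Role(IntEnum):
    """Community roles, ordered so that a higher value includes the lower ones."""
    READER = 1
    CURATOR = 2
    MANAGER = 3
    OWNER = 4


# Minimum role needed for each action (hypothetical action names,
# derived from the role descriptions in this post).
REQUIRED_ROLE = {
    "view_restricted_files": Role.READER,
    "curate_records": Role.CURATOR,      # edit/accept/decline records
    "manage_members": Role.MANAGER,
    "change_settings": Role.OWNER,
    "delete_community": Role.OWNER,
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return role >= REQUIRED_ROLE[action]


print(can(Role.CURATOR, "view_restricted_files"))
print(can(Role.CURATOR, "manage_members"))
```

Modelling the roles as an ordered enum keeps the check to a single comparison; a real system may well use finer-grained, non-cumulative permissions.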

More about communities

Reviews

Submissions to communities now enable the curator and uploader to have a conversation directly on the platform. Curators of the community will receive the review request, and can have a conversation with the submitter, as well as preview the submitted record. Both the submitter and curator can edit and update the record under review until it's published.

More about review

Upload form

Our upload form has also received a major overhaul. For instance, you'll find the following changes:

FAQ

Will your APIs be backward compatible?

Yes, we are doing our utmost to ensure backward compatibility of our APIs, so that your integration will continue to work on the new platform as well.

Will I be required to update my API integration?

After Zenodo on InvenioRDM has been launched (autumn 2023), we will deprecate some of our existing APIs. We will provide a one-year migration period for the transition. New features will only be available on the new API.

Where do I find documentation of your new REST API?

You can find the documentation for our new REST API in the InvenioRDM documentation.

Will the community curators be able to edit my existing records?

No. You will have to opt in to allow community curators to edit the metadata of records already uploaded to Zenodo. Metadata of new records you upload after the launch will by default be editable by curators (and you'll be asked to confirm).

Will feature X still work?

Yes, we will provide backward compatibility for all existing features.



Northwestern, CERN Join NIH in Enhancing Access to Biomedical Research

by Northwestern University, CERN and NIH, on September 30, 2022


Cross posted from Northwestern

A new award to Northwestern University Feinberg School of Medicine and the European Organization for Nuclear Research (CERN) will enhance capabilities of data management and sharing for National Institutes of Health-funded researchers through the Generalist Repository Ecosystem Initiative (GREI), led by the NIH Office of Data Science Strategy.

This modernization of the data ecosystem aligns with the NIH Strategic Plan for Data Science and includes search and discovery of NIH-funded data in generalist repositories. The GREI establishes a common set of cohesive and consistent capabilities, services, metrics, and social infrastructure across repositories, and facilitates the adoption of FAIR principles to better share and reuse data.

Zenodo joins the GREI through a partnership between Northwestern University and CERN, led by Kristi Holmes, PhD, director of Galter Health Sciences Library and Learning Center and professor of Preventive Medicine in the Division of Health and Biomedical Informatics, and Tim Smith, PhD, head of IT Communication, Education and Outreach at CERN. The Zenodo GREI team features expertise and leadership from both sites, including Jose Benito Gonzalez Lopez, PhD, head of Institutional Repositories at CERN; Lars Holm Nielsen, InvenioRDM product manager at CERN; Matthew Carson, PhD, senior data scientist and head of Digital Systems at Galter Library; and Sara Gonzales, senior data librarian at Galter Library and community manager for InvenioRDM. Additional team members will be recruited in the coming months.

Since its launch almost 10 years ago, Zenodo has served as an open, dependable home for science, enabling researchers to share and preserve a wide range of interdisciplinary research outputs. Zenodo was established through the European Commission OpenAIRE program and is operated by CERN. Zenodo houses over 2 million records and a petabyte of data, serving 15 million user visits from around the world annually.

Over the past several years, CERN and Northwestern have partnered with the Invenio Open Source Community (IOSC) to develop InvenioRDM, a turnkey, scalable, and top-of-the-class user experience software for repositories, forming a strong and sustainable foundation for Zenodo. The InvenioRDM software is dedicated to offering a reliable environment for science, empowering preservation, credit, discovery, and sharing while maintaining integrity in its responsiveness to the evolving needs of the research community, including data sharing policy compliance.

“Our strong and efficient partnership with Northwestern through the InvenioRDM project has shown how effective we can be with our complementary skills and common goals,” Smith said. “The GREI allows us to take this partnership to the next level in delivering a useful service to NIH-funded researchers. We are excited that the NIH is supporting us and entrusting us with this task.”

The NIH Office of Data Science Strategy, formed in 2018 within the Division of Program Coordination, Planning, and Strategic Initiatives (DPCPSI), leads implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaboration with the institutes, centers, and offices that comprise NIH. DPCPSI also plans and coordinates the NIH Common Fund’s support of trans-NIH initiatives and research.

“Modern research requires collaboration and thoughtful, feature-rich technology for success,” Holmes said. “We’re thrilled to build on our longstanding partnership with CERN to advance our shared commitment to FAIR practices and we look forward to working together and with the GREI partners to achieve the goals of the program.”

The Zenodo GREI project is supported by the NIH Office of Data Science Strategy/Office of the NIH Director pursuant to OTA-21-009, “Generalist Repository Ecosystem Initiative (GREI)” through Other Transactions Agreement (OTA) Number 1 OT2 DB000013-01.

Situated within the Northwestern University Clinical and Translational Sciences (NUCATS) Institute, Galter Library is the only library embedded within a CTSA hub. NUCATS is supported, in part, by the National Institutes of Health’s National Center for Advancing Translational Sciences, Grant Number UL1TR001422.



Fighting spam - safe listing

by Lars Holm Nielsen, on July 13, 2022


Zenodo's vision is to enable researchers around the world to share and preserve any research output from any discipline via a seamless user experience. The same features that make it easy for any researcher to share and preserve their research, as a side effect also make it easy for spammers to misuse our service.

As Zenodo grew in popularity, our spam problem grew as well. We firmly believe in the need to make sharing and preserving research data as easy as possible, and thus we have always opted against introducing factors blocking researchers' ability to share and preserve their research instantly. So far, we have been fighting spammers with automated classification systems and manual reviews, yet with every counter-measure we've taken, spammers have adapted their methods.

Today, we're introducing yet another counter-measure to fight spammers. Content from new users will, as of today, be ranked below content from safelisted users. This means that spam will be less visible in all search results, allowing our automated classification system more time to catch the spam. In addition, we will be introducing a human review of all new users uploading content to Zenodo that will allow us to safelist new users and catch spammers. The human review is in the process of being introduced as part of our support operations, and we will also go through the backlog of existing users to safelist them. We have seeded the initial safelist with all users who logged in via ORCiD and GitHub, and users with existing uploads accepted in communities.
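The ranking rule can be illustrated with a short sketch. This is not Zenodo's actual search code; the `User` and `Hit` classes, their field names and the `relevance` score are assumptions made for the example:

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    logged_in_via_orcid_or_github: bool = False   # seeding criterion
    has_community_accepted_upload: bool = False   # seeding criterion
    manually_reviewed: bool = False               # safelisted by human review


def is_safelisted(user: User) -> bool:
    """A user is safelisted if seeded into the list or approved by review."""
    return (
        user.manually_reviewed
        or user.logged_in_via_orcid_or_github
        or user.has_community_accepted_upload
    )


@dataclass
class Hit:
    title: str
    relevance: float  # hypothetical search score, higher is better
    owner: User


def rank(hits: list[Hit]) -> list[Hit]:
    """Safelisted content first; within each group, sort by relevance."""
    return sorted(hits, key=lambda h: (not is_safelisted(h.owner), -h.relevance))


alice = User("alice", logged_in_via_orcid_or_github=True)
newcomer = User("newcomer")
hits = [
    Hit("New upload", 0.9, newcomer),
    Hit("Dataset A", 0.5, alice),
    Hit("Dataset B", 0.7, alice),
]
print([h.title for h in rank(hits)])
```

Note that the record from the not-yet-safelisted user is still returned, only lower in the list, which matches the intent of reducing visibility without blocking uploads.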

All in all, if you're an existing Zenodo user who has neither logged in via ORCiD or GitHub nor had any uploads accepted in a community, your records will appear at the bottom of search results. We will be working on safelisting all existing users as fast as possible.

If you're a new Zenodo user, your uploads will also appear at the bottom of search results until our manual review has safelisted you. We plan to safelist new users at least once a day on business days, but until we have worked through the backlog of existing users there might be a longer delay.

The new feature in no way limits your ability to share and preserve research results. You can still upload your data, software and publications to Zenodo, and get a DOI instantly. The new measures only make sure that spam that makes it past our automated classification system is much less visible in search results, until a human review can catch the spammer.