Google Summer of Code and Zenodo summer update

by Krzysztof Nowak on July 31, 2017


Google Summer of Code 2017

Zenodo has been taking part in Google Summer of Code 2017 and since June two students, Aman Jain and Xiao Meng, have been working with our team on introducing two exciting features to Zenodo by the end of the summer.

Aman Jain's project will introduce public user profiles on Zenodo, allowing our users to share and showcase their uploads.

Xiao Meng's project introduces a new backend file-processing module, which will enable us to, for example, extract metadata from PDF documents and use that information to improve search or to pre-fill upload forms.

New feature: support contact form

Thanks to Aman Jain, one of our GSoC students, you can now contact us more efficiently through a contact form available at zenodo.org/support. This will allow us to organize and resolve our support requests better and faster.

New feature: logged in devices

We recently updated Zenodo to the latest Invenio version, which brought along a new feature that allows users to view all devices currently logged into their account. This security feature allows you to remotely log out of devices in case you, for example, forgot to log out of your Zenodo account on a public computer.

You can view all currently logged-in devices by navigating to the Security tab in your account settings.

Active sessions

New feature: status page

In the footer of all Zenodo pages you will now find a "Status" link (status.zenodo.org), which shows the current status of Zenodo and uptime statistics for Zenodo pages and services.



Upload storage incident

by Lars Holm Nielsen on July 19, 2017


What happened?

As a result of a regular automatic file integrity check, as well as some user reports, we have discovered that 18 files uploaded to Zenodo after June 21st this year were not stored successfully. Despite serious efforts we have not been able to recover any of these 18 files from the CERN storage servers.

How did it happen?

We are taking this incident very seriously and have thoroughly investigated what happened. The root cause was the coincidence of two software bugs: one bug was found in the underlying disk storage system and the other in the client software that our web servers use to connect to the disk storage system. The two bugs were activated on June 21st, when our underlying CERN disk storage system was upgraded to a new major software release. Only files uploaded on or after June 21st could have been affected, and of the 15,000 files uploaded to Zenodo since then, only 18 were actually affected.

An in-depth explanation of the incident is provided below.

Is it fixed?

Yes. We have already deployed fixes for the two software bugs. We have also taken further measures to ensure similar issues cannot happen again. While our file integrity checks did catch the errors, we have taken steps to improve this monitoring so that we are alerted immediately in the future.

Is my file affected?

We have personally contacted all affected users by email, and since only a tiny fraction of recently uploaded files were affected we are hoping to recover all files from their respective uploaders.

Why could you not recover the files?

The reason we could not recover any of the files is that the files were never stored on our storage system, and thus our backups did not contain them either (see in-depth explanation below). The information we do have is metadata such as the file size and file fingerprint (MD5 checksum), as these are calculated on the web server side. This information allows us to check whether files recovered from the respective uploaders are indeed the exact same files.
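The check described above amounts to comparing a recovered file's size and MD5 fingerprint against the metadata recorded on the web server side. A minimal sketch of such a check (the function name and parameters are illustrative, not Zenodo's actual code):

```python
import hashlib
import os

def verify_recovered_file(path, expected_size, expected_md5):
    """Check a file re-supplied by an uploader against the size and
    MD5 fingerprint recorded in the repository's metadata."""
    # A size mismatch rules the file out before any hashing is done.
    if os.path.getsize(path) != expected_size:
        return False
    # Hash in 1 MiB chunks so large uploads are not read into memory at once.
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected_md5
```

Because MD5 collisions are extremely unlikely to occur by accident, a matching size and checksum gives high confidence that the recovered file is byte-for-byte identical to the original upload.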

What measures are you taking to prevent this in the future?

We are operating complex systems with tens of terabytes of data and millions of files, and we anticipate that failures will inevitably happen. That is also why we go to great lengths to safeguard files that users upload to Zenodo. In this case, one of our many checks did catch the problem, though with a delay of three weeks instead of immediately. We now have measures in place that ensure we catch a similar problem right away, and we will continue to proactively anticipate other types of failures and build countermeasures against them as part of our preservation strategy.

In-depth explanation of the incident

When a user uploads a file to Zenodo, the file is streamed through one of our web servers down to a storage server in our disk storage system. The disk storage system then immediately replicates the file to another storage server in the cluster before sending back a response to the web server that the file was successfully written to disk. On a successful write, the web server will then record metadata about the file in our database and let the user know the file was successfully uploaded.

One of the software bugs affected the underlying client library that Zenodo uses to connect to the storage system. After a complete file was sent from the web server to the storage system, the client library did not properly check the final reply from the storage system for errors. This meant that certain errors reported by the storage system were not caught by the client library, leading the web server to think the file had been written successfully to disk when in fact there was an error.

The other software bug was found in the new version of the disk storage system software. Once the storage server had received the entire file, it would try to replicate the file to another storage server in the cluster. If this other storage server was unresponsive (e.g. due to high workload or network congestion), the replication operation would time out. The storage server would then proceed to clean up the file (i.e. delete it) and send back an error reply.

Thus, when a file replication operation failed in the storage system, the client library did not catch that there had been an error, leading the web server to think the file had been successfully written to disk when in fact the storage system had never stored the file. This error did not manifest itself prior to June 21st, because the previous software version on the disk storage system would automatically recover from the replication failure rather than send an error reply back. As a result of this incident, the disk storage system software will reinstate the previous behaviour and try to immediately recover from the replication failure.
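The client-side half of the failure can be illustrated with a short sketch. All names here are hypothetical (this is not the actual Invenio or storage-client code); it only shows the pattern: after streaming the last chunk, the buggy client skipped checking the storage system's final reply, so a replication-failure error went unnoticed and the upload was reported as successful.

```python
import io

class StorageError(Exception):
    """Raised when the storage system reports a write failure."""

def upload(stream, conn):
    """Stream a file to the storage system in 1 MiB chunks.

    `conn` is a hypothetical storage connection; its final reply is a
    dict such as {"status": "ok"} or {"status": "error", "message": ...}.
    """
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        conn.write(chunk)
    reply = conn.close_and_read_reply()
    # The fixed client checks the final reply for errors. The buggy
    # client omitted this check, so a replication timeout on the storage
    # side (which deleted the file) still looked like a successful write.
    if reply.get("status") != "ok":
        raise StorageError(reply.get("message", "unknown storage error"))
    return True
```

With the check in place, a replication failure surfaces as an exception on the web server, so no database record is created and the user is told the upload failed instead of receiving a false success.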



Zenodo now supports DOI versioning!

by Lars Holm Nielsen on May 30, 2017


We are pleased to announce the launch of DOI versioning support in Zenodo - the open research repository from OpenAIRE and CERN. This new feature enables users to update a record's files after they have been made public, and allows researchers to cite either a specific version of a record or, via a top-level DOI, all versions of a record.

DOI versioning support was one of our most requested features for Zenodo, and it has been co-developed by OpenAIRE’s Zenodo team and EUDAT’s B2SHARE team as an extension module for CERN’s Invenio digital repository platform, which powers both Zenodo and B2SHARE.

This update comes hot on the heels of the recent relaunch which made Zenodo faster, improved GitHub integration, integrated support for Horizon 2020 grant information, and enabled 50 gigabyte uploads!

Read more about the inner workings of the new feature in the DOI Versioning FAQ.

DOI versioning for Zenodo