Linking community and technology to enable FAIR data
Access your archived research data as if it were stored on your own computer
When using a storage facility for your research data, you sometimes want to extract parts of that data or simply access specific files. Several download tools are available for this, each with its own functionality and specifications. The Maastricht Study was looking for a method to access the data archived at DataHub with applications like R and Matlab, without having to download it. DataHub compared several methods and arrived at a set of recommended tools. By using one of these tools you can access files stored at DataHub as if they were on a network drive. Read more about our findings.
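As a rough illustration of what "as if they were on a network drive" means in practice: once one of the recommended tools has mounted the storage, ordinary file functions work on it. The minimal Python sketch below assumes a hypothetical mount point and file pattern; neither comes from DataHub documentation, and the same idea applies in R or Matlab.

```python
from pathlib import Path


def list_files(mount_point: str, pattern: str = "*.csv") -> list[Path]:
    """Return files under a mounted drive that match a glob pattern, sorted by path."""
    root = Path(mount_point)
    if not root.is_dir():
        raise FileNotFoundError(f"Mount point not available: {root}")
    # rglob walks the mounted directory tree; file contents are not read here.
    return sorted(p for p in root.rglob(pattern) if p.is_file())


def read_first_line(path: Path) -> str:
    """Read only the first line of a file, for example a CSV header."""
    with path.open("r", encoding="utf-8") as handle:
        return handle.readline().rstrip("\n")
```

For example, `list_files("/mnt/datahub")` (a hypothetical mount point) would list the CSV files in the mounted archive, and `read_first_line` would fetch a header without opening the whole file in an application first.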
M4I saves approximately €16,000 per year by storing its data at DataHub
DataHub provides several types of storage facilities, each with their own trade-off in costs, performance and data availability. DataHub services are free, though storage is charged at cost price. If you have less than 100 GB of data, DataHub offers you free storage. Your data will be on our highest performing storage facility. For larger data sets, several storage solutions are available. Read more.
COVID-19 data now available in our local data catalogue DISQOVER (please use VPN)
We are happy to announce that COVID-19 data is now available in our local DataHub DISQOVER instance. The collective worldwide knowledge about the COVID-19 virus and how to mitigate it is growing every day. Therefore ONTOFORCE, the provider of DISQOVER, has launched a beta release of DISQOVER COVID-19 Edition, a coronavirus-specific edition that aims to improve the organisation, integration and accessibility of rapidly evolving information related to the COVID-19 pandemic. DataHub worked hard, together with ONTOFORCE, to integrate the COVID-19 data into our local DISQOVER instance, so you can now search across linked COVID-19 data yourself.
Up-to-date information related to COVID-19 research
Updated daily, the DISQOVER COVID-19 Edition brings together siloed knowledge as it emerges around the world into one place. As the COVID-19 pandemic unfolds it can become a full-time job just to keep up with the latest developments, and the risk of making decisions on poor, out-of-date or incomplete information creates more uncertainty in an already uncertain world.
Start your search across linked COVID-19 data
DISQOVER is ideally suited to integrate accurate, up-to-date information related to COVID-19 from around the world. You can simply access our local DISQOVER instance online. Log in to the DataHub DISQOVER system with your institutional account. You can start your search by entering free-text keywords or choosing one of the displayed data categories. You will find the COVID-19 data in the COVID-specific categories (COVID publications, COVID patents and COVID clinical trials).
Do you want to know more about how we use DISQOVER within DataHub? Please have a look at this portal or send an email.
New, secure service by connecting DataHub with DataverseNL. Make your research data openly available for new research.
As a result of a joint effort by DataHub and Maastricht University Library, researchers using the DataHub storage services now have the opportunity to make their data collections openly available via DataverseNL. From now on, with the push of a button, a researcher can copy a data collection, or parts of it, from DataHub to DataverseNL, including the descriptive metadata of the data collection. The data can then be published as open access or with restrictions (i.e. access upon request).
The process is arranged in a way that every data collection will be curated by a data steward from the University Library before the data collection is published. This minimises the risk of publishing sensitive data and increases the quality of the descriptive metadata of the data collection.
Maastricht University and Maastricht UMC+, but also more and more funders and publishers, foster Open Science and the creation of FAIR research data. DataHub and Maastricht University Library provide storage solutions and research data management services. They both work according to the FAIR principles and help you make data Findable, Accessible, Interoperable and Reusable. With this service, researchers from both Maastricht University and Maastricht UMC+ are able to meet the demands of funders and publishers to make data publicly available.
People who already have an account at DataHub can contact their data steward to get their research data project(s) enabled for this feature. All others can contact either DataHub or the University Library to get things started.
About DataHub and the University Library
DataHub provides data management services for studies in both the Faculty of Health, Medicine and Life Sciences of Maastricht University and Maastricht UMC+. For security reasons the FAIR data process is focused on securely sharing data within the MUMC+ life sciences community. Nevertheless, it may be desirable to publish data collections to a larger community and to make them publicly available, in accordance with ethical conditions and legal requirements.
Maastricht University Library provides DataverseNL as a solution for storage during and after the research project, with the ability to make research data publicly available (worldwide). Dataverse is developed by the Dataverse Team at the Institute for Quantitative Social Science (IQSS) at Harvard University and is an open source web application to share, preserve, cite, and explore research data. Sharing data in Dataverse increases visibility, and researchers receive appropriate credit via a data citation with a persistent identifier. DataverseNL is a national project in partnership with DANS (KNAW), 9 Dutch universities, 2 universities of applied science and 3 research institutes.
Excited to learn more about this new service or interested in using this new service? Please contact the data steward of your faculty or contact DataHub or the University Library directly.
Coronavirus (COVID-19): DataHub infrastructure
The coronavirus affects us all. This week, the consequences have become noticeable for everyone. This exceptional situation demands a lot from us in terms of adaptability and improvisation. We would like to inform you that you can use the DataHub infrastructure as you are used to.
The DataHub services are accessible via a VPN connection. You do not need a VPN connection to use the applications DMPMaastricht, Confluence and Jira. If you encounter any problems while using the DataHub services or if you have any questions about our services, please send an email.
- The TOPdesk self service portal offers software downloads for employees and students.
- You can find COVID-19 updates and contact information on the Maastricht University website. Please note that for the academic hospital additional measures are taken. You can find updates on the MUMC website.
- Employees can contact email@example.com for faculty specific questions.
We wish everyone a lot of strength, now and in the coming period. Take good care of yourself, your loved ones and each other.
Cyber attack update: DataHub infrastructure operational
We are happy to announce that the DataHub infrastructure is back in business. You can use our infrastructure as you were used to before the cyber attack. Since the attack, our staff have been working hard to reactivate all systems and we are pleased to announce that our services are now back online. While we were unavailable, we also worked on new features that improve our backend functionality.
The DataHub infrastructure has been improved and we have taken measures to reduce vulnerability.
- Measures regarding federated admin access on DataHub systems were already in place, but have been improved.
- Firewalls, local as well as central, have been configured more strictly and now also serve as a second line of defense.
- We continue to apply security updates frequently.
- A central monitoring and an alerting system for detecting suspicious activity are in place. Improvements can and will be made.
- DataHub has central configuration management of servers in place.
- Offline backups are available. Improvements can and will be made.
One of the ‘lessons learnt’ is that Maastricht University wants to focus more on awareness campaigns in order to reduce the number of successful malicious attempts. At DataHub, the security awareness level for phishing and password management is reasonably okay. However, the human factor is never safe and can always be improved, so we support the idea of a broad cyber security awareness campaign.
If you encounter any problems while using the DataHub services or if you have any questions about our services, please send an email.
Cyber attack update
On 23 December 2019, Maastricht University was hit by a serious ransomware cyber attack. Fortunately, DataHub was not significantly affected by the attack. All research data are safe and there are no indicators of compromise. Since the attack, our staff have been working hard to reactivate all systems and we are doing our utmost to get our services back online.
In the first week of January, we assessed the state of all our DataHub servers to confirm that they are safe. The DataHub servers that were affected by the ransomware are involved in workflow control and data transfer. These servers did not contain any crucial data. We are currently re-enabling our systems. For the recovery, however, we depend on the availability of the corporate network and firewall configurations. The university has put a stricter policy on server requirements into effect, and reconfiguring systems to meet those requirements takes time. At the same time, we are improving the DataHub infrastructure and taking measures to reduce vulnerability.
One of the ‘lessons learnt’ is that Maastricht University wants to focus more on awareness campaigns in order to reduce the number of successful malicious attempts. The security awareness level at DataHub for phishing and password management is reasonably okay. Still, DataHub supports the idea of a broad cyber security awareness campaign.
Measures were already in place regarding federated admin access on DataHub systems, but are now hardened to make lateral network jumps via compromised servers and accounts across DataHub systems less likely. Firewalls, local as well as central, have been configured more strictly and now also serve as a second line of defense. We continue to apply security updates frequently. A central monitoring and an alerting system for detecting suspicious activity are in place. Improvements can and will be made.
DataHub has central configuration management of servers in place.
Offline backups are available.
We will inform you as soon as there are any new developments. We expect to be back in business very soon. Updates will be given on the DataHub portal. For questions about DataHub services, please send an email.
Although your data at DataHub was not affected by the recent cyber attack, we are currently re-enabling our systems and aim to have our RDM portal operational as soon as possible.
DataHub software version 3.3.0 released
A major update of the DataHub infrastructure has taken place. Most of the changes involve fixes for security, stability and functionality of the backend infrastructure. We are happy to announce several improvements in the self-service portal for users of our infrastructure.
Users can now request new DataHub projects more easily through a self-service form, which replaces the existing Word document. Coupled with a revised workflow for project creation, this lets our data stewards approve and create your project more easily and in less time.
No queue for ingests
The parallel ingest functionality makes it possible to run multiple ingests at the same time. Waiting for other users to finish their ingests should only occur at very busy times.
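The idea of parallel ingests can be sketched with a small worker pool. This is only an illustration of the concept, not DataHub's implementation: the `ingest` function and the `max_parallel` limit are hypothetical. Jobs run side by side up to the limit, and only when more jobs arrive than there are free workers does a job have to wait, which mirrors the "very busy times" mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest(drop_zone: str) -> str:
    """Hypothetical stand-in for one ingest job; a real job would transfer and register data."""
    return f"{drop_zone}: ingested"


def run_ingests(drop_zones: list[str], max_parallel: int = 4) -> list[str]:
    """Run several ingests concurrently, at most max_parallel at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in input order, even though the jobs overlap in time.
        return list(pool.map(ingest, drop_zones))
```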
Change project permissions
Users with the manager role can now update or set permissions for users on their projects. Look for the blue “Change project permissions” button in the projects browser.
New functionalities in latest DataHub software version
We are happy to announce several major improvements in the latest release of the DataHub software. In this release (version 2.3.0), our primary focus has been on improving the overall user experience. On a more technical level we have improved the stability of several processes. As a result, the DataHub infrastructure is more scalable to be prepared for the future.
Based on feedback from our users, we brought more functionality into the DataHub website in the so-called self-service portal. Retrieving the data uploaded by you or your colleagues has never been this easy. In this release, we introduce the Project and collection browser, which enables users to browse project and collection (meta)data directly from within the DataHub portal (see Figures 1 and 2).
Figure 1: The projects overview lists all projects the current user has access to.
Principal investigators can also see and edit financial information regarding the project.
Figure 2: Clicking on a project lists all data collections that have been ingested
Data files inside a collection can now be downloaded using the Collection browser page, which opens after clicking a collection (see Figure 3).
For downloading large files or recursive directories, we recommend using the Direct download options from the menu.
Figure 3: On the collection browser page, the collection can be traversed and individual files can be downloaded.
The semantic search platform DISQOVER has been launched for end users to search through publicly available data sets and variables used in the Maastricht Study. In this release we have added functionality to search and retrieve data sets that were ingested into the DataHub system. DISQOVER will eventually replace the old research data warehouse (see Figures 4 and 5).
Figure 4: The DISQOVER system now contains project and collection metadata and can be used to search for data sets ingested into the DataHub system.
Figure 5: The old Research Datawarehouse will be replaced by DISQOVER
Furthermore, there is now a tighter integration between DISQOVER, the persistent identifier landing page (Figure 6) and the collection browser.
A complete overview of all data ingest and retrieval methods is shown in Figure 7.
Figure 6: Persistent identifier landing page
Figure 7: Overview of data ingest and retrieval methods
Other technical details
- New menu structure in web portal.
- The storage capacity has been expanded by attaching new hardware resources to the iRODS system.
- Increased performance for data ingest operations.
- Various improvements in logging.
- Upgrade to iRODS 4.1.12.
- New microservice to create ePIC persistent identifiers.
- New queuing mechanism for various workflows (RabbitMQ).
- Fixed creation of ePIC persistent identifier after certificate update at SURFsara.
- Prevent unexpected reboots of Windows servers.
- No extra write on checksum operations.
- Fixed case where drop zones without creator were not handled properly.
*Special thanks to Dr. Dennie Hebels (cBITE group, MERLN) for allowing us to use the above-mentioned screenshots.
DataHub joins iRODS Consortium
DataHub has joined the iRODS Consortium, the foundation that leads development and support of the integrated Rule-Oriented Data System (iRODS).
DataHub software version 2.2.0 released
We are pleased to announce the latest release of the DataHub infrastructure. Most of the developments involved bug fixes or changes to the backend infrastructure. The changes that are visible for end users will be mentioned here.
Noteworthy is the expansion of our total storage capacity to 120 terabytes.
For each dataset uploaded to DataHub, persistent identifiers (PIDs) are created. With our latest release, PIDs now redirect to a landing page containing basic metadata about the data set. View an example.
The semantic search platform DISQOVER has been launched for end users to search through publicly available data sets and variables used in the Maastricht Study. In an upcoming release, we will be adding functionality to use DISQOVER for in-house data sets that were uploaded by our end users.
On the ‘Browse data’ page you can find instructions on how to use webDAV methods when using large data sets.
The information that is displayed in the iRODS report that is used by our data stewards has been extended. We are currently working on self-service functionality in our web portal that allows every user to get an overview of their data sets. Additionally, users with the role of principal investigator will be able to get an up-to-date overview of the project costs involved. This will become available in the next DataHub release.
Master Person Index
Master Person Index is now used for patient data that are processed in the research project Inflammatory Bowel Disease (IBD).
The monitoring and logging mechanisms have been improved, which allows our development team to respond better and faster to user-generated errors.
DMPMaastricht for creating, reviewing and sharing data management plans
To support researchers in producing an effective data management plan (DMP), we have launched the web-based tool DMPMaastricht, based on the DMPonline tool. DMPMaastricht is hosted by DataHub and provided in close collaboration with MEMIC Maastricht, the Clinical Trial Center Maastricht (CTCM) and the Grants Office. We ease the burden on the researcher by providing institutional guidance and example answers.
Funding agencies are more and more expecting you to integrate data management in your research proposal. A DMP helps you to manage data, meet funder requirements and help others use your research data if shared. Some funders mandate the use of DMPMaastricht, while others point to it as a useful option.
Ease the burden on the researcher
You can download funder templates without logging in. However, DMPMaastricht also provides tailored guidance and example answers from funders and your organisation, provided by DataHub, MEMIC, CTCM and the Grants Office. With DMPMaastricht it is easy to collaborate with internal and external experts on your data management plan.
DMPMaastricht offers templates of several funding agencies, currently ZonMw, NWO and the European Commission (Horizon2020).
DataHub software version 2.1.3 released
We are happy to announce the latest release of the DataHub infrastructure. Most of the developments involved bug fixes or changes to the backend infrastructure. The changes that are visible for end users will be mentioned here. Noteworthy is the expansion of our DataHub infrastructure with the semantic search platform DISQOVER.
DISQOVER has been launched as semantic search platform for public and private research data. DISQOVER makes it possible to bring semantic searching to a wide research community. DISQOVER is user-friendly and provides an intuitive user interface. As of this moment, researchers from both the academic hospital and the university can simply access our local DISQOVER instance online. In the next release of the DataHub infrastructure, metadata from iRODS data sets will be added to the DISQOVER system. Read more about DISQOVER.
We have simplified the syntax of the persistent identifiers that are generated for each data set. Previously, persistent identifiers (PIDs) contained an auto-generated 128-bit random identifier. With this update, we now use a combination of project and collection in order to increase human-readability. In the next release, PID landing pages will be included.
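To make the difference concrete, the sketch below contrasts the two styles. It is an illustration under assumptions: the prefix `12345`, the identifiers `P000000001` and `C000000001`, and the way project and collection are joined are all hypothetical, and a version 4 UUID is used here only as a familiar example of a 128-bit random identifier.

```python
import uuid


def random_pid(prefix: str) -> str:
    """Old style: prefix plus an auto-generated 128-bit identifier (here a version 4 UUID)."""
    return f"{prefix}/{uuid.uuid4()}"


def readable_pid(prefix: str, project: str, collection: str) -> str:
    """New style: prefix plus project and collection identifiers (exact layout is assumed)."""
    return f"{prefix}/{project}{collection}"


# The random form changes on every call and carries no meaning for a human reader.
print(random_pid("12345"))
# The readable form tells you at a glance which project and collection it refers to.
print(readable_pid("12345", "P000000001", "C000000001"))
```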
We fixed a bug for cases where Drop zones were not deleted properly after a successful ingest operation.
We fixed a bug for cases where the Cloud Browser application displayed inaccurate, and sometimes negative, values for project sizes.
We have also improved the compatibility with Internet Explorer.
The virtual machines that host DataHub production services have been migrated to new and upgraded hardware, which results in improved performance.
Expansion DataHub infrastructure with search platform
We are happy to announce the latest expansion of our DataHub infrastructure with the semantic search platform DISQOVER. DISQOVER makes it possible to bring semantic searching to a wide research community. DISQOVER is user-friendly and provides an intuitive user interface. As of this moment, researchers from both the academic hospital and the university can simply access our local DISQOVER instance online.
Coupling in-house data with public data sources via DISQOVER’s data federation greatly extends our view on the data. With DISQOVER, it becomes possible to simultaneously aggregate results from data residing at public and private sources (e.g. stored in the DataHub infrastructure) that otherwise would have to be collected or searched separately, thereby improving end-user’s efficiency.
How it all works
A typical DISQOVER workflow consists of five major steps.
- You log in to the DISQOVER system.
- You start your search by simply entering free-text keyword(s) or choosing one of the displayed data categories.
- Many categories are displayed, which are populated by about 130 public data sources (PubMed, EntrezGene, ClinicalTrials.gov and many more) and local data sets at DataHub (indicated by the MUMC+ heart icon).
- Via the DISQOVER federated technology, search results are based on private data sources that reside at MUMC+ DataHub and public data sources that reside at the public endpoint of DISQOVER.
- Various types of graphs of the result set can be visualised using DISQOVER’s Visual Analytics feature and, if that is not sufficient, the result set can be exported to CSV for use in other (statistical) applications.
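As an example of the last step, an exported CSV can be picked up by any statistical tool or script. The following Python sketch counts rows per category in an export; the column names `title` and `category` are assumptions for illustration, since the actual export layout depends on your search.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text: str, column: str) -> Counter:
    """Count rows per distinct value of one column in an exported CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Hypothetical export with made-up rows, only to show the call.
sample = "title,category\nA,Publication\nB,Patent\nC,Publication\n"
print(count_by_column(sample, "category"))
```

For a file on disk, read its contents first (for example with `Path("export.csv").read_text(encoding="utf-8")`) and pass the text to `count_by_column`.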
Expand your reach
Systems like DISQOVER generate their greatest impact through reach and accessibility. Within Maastricht UMC+, the Heart and Vascular Center, the Maastricht Study, the Maastricht Multimodal Molecular Imaging Institute (M4i) and the Institute of Data Science (IDS) are already actively using DISQOVER.
Start your disqovery today!
You can start your disqovery through the public and in-house data sets by clicking the DISQOVER menu entry on the left and logging in with your institutional account. Of course, access to some data might be limited by your authorization level.
Do you want to know more about how we use DISQOVER within DataHub? Please send an email to: firstname.lastname@example.org.
Latest release DataHub infrastructure
We are happy to announce the latest release of the DataHub infrastructure. Most of the developments involved bug fixes or changes to the backend infrastructure.
The changes that are visible for end users will be mentioned here.
IDS group members can now log in and use the DataHub portal for metadata discovery.
We are also happy to welcome the DLab group as newest user group of DataHub.
We have improved the user experience on the ingest button by making it visually different from other buttons and providing a context message that indicates that the resulting operation is irreversible. Also, users will receive an error message with appropriate instructions if they try to use functionality that requires UM, azM or VPN network connectivity. Moreover, the usage of the WebDAV protocol and respective user manuals are promoted.
We continuously work on improvements and we are happy to receive your feedback for our upcoming release.
DataHub software version 2.1.0 released
We are pleased to announce the latest release of the DataHub infrastructure. Noteworthy are the improvements in the stability of the Data warehouse, the performance of the Drop zones ingest operation and new functionalities in the metadata form.
Besides various bug fixes we can also introduce new functionalities in the metadata form. The metadata form now contains direct links to knowledgebase articles about investigations and projects as well as an e-mail link for requesting additional project authorisations.
We've increased the performance of the Drop zones ingest operation. We fixed a bug for cases where Drop zones remained visible after a successful ingest and we've also improved the method with which network access rights on Drop zones are being set.
The Research Data warehouse has undergone several bug fixes and is more stable now.
Moreover, the DataHub web portal now has an option to keep your web session active. By using the ‘remember me’ checkbox, you will stay logged in unless you specifically sign out.
MirthConnect has been upgraded to version 3.5.0.
We have extended the user manuals on the DataHub knowledgebase. For M4I users, we have written some specific manuals.
New functionalities DataHub infrastructure
We are very happy to announce the latest release of the DataHub infrastructure. We continuously work on improvements driven by your (research) input. Besides various bug fixes, we also introduce new functionalities in the latest release.
By using the DataHub infrastructure, you are sure to make use of an encrypted and secure connection, since frontend applications that were served over HTTP are now encrypted with TLS/SSL.
Our services are now also linked to the azM network and parts of our infrastructure are deployed on azM-hardware. This allows us to help researchers from the academic hospital who want to work with data from Maastricht University and vice versa.
As for metadata, persistent identifiers are automatically being generated from the handle.net system for each dataset/ingest. Moreover, you can now easily link a publication to your dataset, by using a Digital Object Identifier (DOI).
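Both kinds of identifier become clickable links through their public resolvers: handles via the Handle.Net proxy at hdl.handle.net and DOIs via doi.org. The sketch below builds such links; the handle and DOI values you pass in are your own, and the helper names are hypothetical rather than part of the DataHub software.

```python
from urllib.parse import quote


def handle_url(handle: str) -> str:
    """Build the URL at which the Handle.Net proxy resolves a handle."""
    return "https://hdl.handle.net/" + quote(handle.strip(), safe="/")


def doi_url(doi: str) -> str:
    """Build a resolvable URL for a DOI, accepting an optional 'doi:' prefix."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(doi, safe="/")
```

For example, `doi_url("doi:10.1000/182")` yields `https://doi.org/10.1000/182`, which can be stored alongside the dataset metadata as the link to the publication.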
The DataHub DataWarehouse is now only accessible by logging on to our portal. This prevents unauthorised access to the metadata.
When uploading new datasets to the DataWarehouse, you will have to fill in an ingest form to add metadata about your dataset. The ingest form has been restyled and now also shows additional information on project level, such as project title, managers and contributors.
Both iRODS and MirthConnect have been upgraded.
iRODS has been upgraded to version 4.1.10.
MirthConnect has been upgraded to version 3.4.2.