Supporting timely responses to infectious disease outbreaks and epidemics
The Data Hubs Portal makes it easier than ever for multi-site project partners to share and access prepublication pathogen genetic sequences in the secure environment of the European Nucleotide Archive (ENA) Data Hubs. This facilitates collaboration and analyses during a project’s lifetime and accelerates the public sharing of FAIR pathogen data once projects are completed. Supported by the Pathogen Data Network (PDN), the new portal is a major milestone in linking data to support research on infectious diseases and epidemics.
Accelerating pathogen research collaborations
Pathogen research projects often include partners from multiple locations or institutions, each with a specific role in generating and analysing data. ENA Data Hubs facilitate such collaborations by providing a managed access environment where project partners can privately share prepublication sequencing data. Depending on their role, users can create a Data Hub for a project, add or access sequencing data, and even include automated analysis and visualizations.
The new ENA Data Hubs Portal further enables such collaborations by providing a user-friendly interface for setting up, accessing, managing and searching Data Hubs. This major milestone of the PDN project will accelerate sharing of prerelease pathogen sequencing data within research consortia and other multi-site partnerships – which is essential for timely responses to infectious disease outbreaks and epidemics.
ENA Data Hubs were originally developed for foodborne infectious disease data under the EU COMPARE project, and then expanded to support COVID-19 and priority infectious diseases. Data Hubs are now available for all data across ENA. In addition to COVID-19 and foodborne pathogens, pathogen-related hubs include viral sequences from the Global Sewage Surveillance project, antimicrobial-resistant bacterial sequences from EU Reference Laboratories, clinical samples from patients with hepatitis A and norovirus, and sequences for monkeypox virus, avian influenza H5N8 viruses and the emerging zoonotic Usutu virus.
Fostering open sharing of FAIR pathogen data
Data Hubs are built on top of the European Nucleotide Archive, a global repository of openly available DNA and RNA sequences hosted by EMBL-EBI. By using the same infrastructure and data standards as ENA, the hubs foster FAIR (Findable, Accessible, Interoperable and Reusable) data management from the start of each project. They also facilitate public release of data to ENA once project results are published.
Advancing the Pathogen Data Network mission
The Data Hubs Portal contributes to two PDN objectives: fostering data discovery and enabling timely data sharing. Pathogen data released from Data Hubs to ENA will also be linked and searchable on the central Pathogens Portal – PDN’s key element for providing global access to integrated infectious disease-related data.
“By facilitating sharing of pathogen data both pre and post publication, the Data Hubs Portal advances PDN’s mission to support research and ensure efficient surveillance and outbreak responses.”
Aitana Neves
Director, Centre for Pathogen Bioinformatics
Associate Director, Clinical Bioinformatics group, SIB
In addition to support from the Pathogen Data Network, the Data Hubs Portal project received funding from the European Union’s Horizon Europe and Horizon 2020 research and innovation programs.
Workstreams involved: WS2 FAIR Data Management