FAIRification of pathogen bioinformatics resources under the Centre for Pathogen Bioinformatics

In this project, SPSP is working towards the FAIRification of its public data so that they are also machine-discoverable. A documented public API has been released and a SPARQL endpoint is under development.

    • FAIR
    • Swissuniversities
    • 30.06.2025

Project description

The COVID-19 pandemic has highlighted the importance of genomic surveillance of rapidly changing pathogens but also exposed shortcomings in the way such data are processed, shared,
and linked. The Centre for Pathogen Bioinformatics aims to provide key tools, expertise, and infrastructure to process, analyse, store and share genomic data of pathogens in a way
that respects the interests of data generators and privacy concerns, while allowing actionable inference and maximising data reuse and openness.
We maintain several renowned resources: the Swiss Pathogen Surveillance Platform (SPSP), Nextstrain, Nextclade, GenSpectrum, CoVariants, V-pipe, COJAC and LolliPop. These are used
individually in strategically important national or international pathogen surveillance projects. Here, we propose to make these tools interoperable and exploit their complementarities as part
of our ambition to provide state-of-the-art solutions for pathogen genomics.
To enable tool integration and data sharing, we will build upon the first prototype instance of our open-source database Loculus, so that the database couples data submission to analysis
and quality control via Nextclade and accepts wastewater data from COJAC and LolliPop. GenSpectrum will interface with Loculus to allow flexible querying, and CoVariants to support variant
frequency analysis. SPSP will implement a SPARQL endpoint for API access and an interface to Loculus. Loculus will allow for restricted data sharing and will eventually release data to the
INSDC databases.

Together, these data portals and analysis platforms are central to enabling reproducible analysis and real-time surveillance of pathogens and already play important roles at the national and
international levels. By further strengthening this ecosystem through increased interoperability and by providing an integrated system for analysis and sharing of pathogen genomic data, we
aim to catalyse genomic epidemiology globally.