Bioinformatics pipelines

HPC clusters

SPSP currently relies on two HPC clusters managed by SIB (running SLURM).

Viruses

SARS-CoV-2

We run V-pipe for genome assembly and QC, followed by Nextclade and pangolin for annotation (updated daily). If the version of the Nextclade tool, the Nextclade dataset, or pangolin changes, both tools are re-run on all SARS-CoV-2 data present on SPSP.
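The re-run rule above can be illustrated with a minimal Python sketch. It is not the SPSP implementation: the class `AnnotationVersions`, the function `samples_to_annotate`, the sample IDs and the version strings are all hypothetical, and the behaviour when no version changes (annotating only new samples) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationVersions:
    """Hypothetical record of the versions whose change triggers a full re-run."""
    nextclade_tool: str
    nextclade_dataset: str
    pangolin: str


def samples_to_annotate(all_samples, already_annotated, previous, current):
    """Return the sorted sample IDs that both annotation tools should process."""
    if previous != current:
        # Any version change: re-run both tools on all SARS-CoV-2 data.
        return sorted(all_samples)
    # Assumption: with unchanged versions, only new samples need annotation.
    return sorted(set(all_samples) - set(already_annotated))


# Placeholder version strings, for illustration only.
old = AnnotationVersions("3.0.0", "2024-01-01", "4.3")
new = AnnotationVersions("3.0.0", "2024-02-01", "4.3")  # dataset changed

samples = ["s1", "s2", "s3"]
print(samples_to_annotate(samples, ["s1", "s2"], old, old))
print(samples_to_annotate(samples, ["s1", "s2"], old, new))
```

Comparing the three versions as one record keeps the rule simple: a change in any single field is enough to invalidate all existing annotations.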

All the tools are containerized with Singularity and deployed using CI/CD.

Influenza and RSV

We run IRMA, followed by VADR and internal QC. To identify Influenza resistance mutations from the WHO lists, we have developed an in-house pipeline called FluR, which is freely available.

All the tools are containerized with Singularity and deployed using CI/CD.

Bacteria and fungi

Bacterial and fungal genomic data are analysed using the IMMense pipeline developed at the Institute of Medical Microbiology of the University of Zurich.

This Nextflow pipeline notably covers:

  • Preprocessing
  • Pre-Assembly QC
  • Assembly
  • Post-Assembly QC
  • Taxonomy and typing
  • Genome annotation
  • Genome inspection (antimicrobial resistance genes, virulence factors)
  • Species-specific modules (e.g. for Listeria or Mycobacterium tuberculosis)

SPSP automatically mirrors the latest IMMense releases and deploys them on a development server using GitLab CI/CD. After testing and validation, new releases may be deployed on the SPSP production server using CI/CD. All the tools are containerized with Singularity.
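The release flow described above (mirror, deploy to development, validate, then production) can be sketched as a small state machine. This is only an illustration of the gating idea, not the actual CI/CD configuration: the names `Stage`, `NEXT_STAGE` and `advance` are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stages a mirrored pipeline release passes through."""
    MIRRORED = "mirrored"
    DEVELOPMENT = "development"
    VALIDATED = "validated"
    PRODUCTION = "production"


# Each stage may only move to the single stage that follows it.
NEXT_STAGE = {
    Stage.MIRRORED: Stage.DEVELOPMENT,
    Stage.DEVELOPMENT: Stage.VALIDATED,
    Stage.VALIDATED: Stage.PRODUCTION,
}


def advance(current, target):
    """Return the new stage, or raise ValueError if a step is skipped."""
    if NEXT_STAGE.get(current) is not target:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


stage = Stage.MIRRORED
for step in (Stage.DEVELOPMENT, Stage.VALIDATED, Stage.PRODUCTION):
    stage = advance(stage, step)
print(stage.value)
```

Encoding the allowed transitions in one table makes it impossible for a release to reach production without first passing through development and validation.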
