Sensitive data management

Legal framework

Sensitive data is managed within a well-established technical and regulatory framework:

  • Technical specifications reviewed by the Confederation for the automatic exchange of data from SPSP to the Federal Office of Public Health.
  • Data Protection Impact Assessment reviewed annually with the Data Protection Officer of SIB.
  • Data Transfer and Use Agreement established with each data provider.
  • Sensitive data hosted on BioMedIT, managed by SIB.

Secure IT infrastructure

The diagram above outlines the SPSP IT infrastructure, which is divided into several zones designed for secure processing and sharing of genomic data. Each zone and its components are described below:

1. SIB ACCESS ZONE (public, but filtered by whitelisting)

  • Purpose: Acts as the entry point for external institutions (e.g., reference laboratories) to upload genomic data. Also used to securely exchange data with federal authorities.

  • Components:

    • sFTP: Used by data providers to transfer files securely.

    • Protocols: sFTP for file transfers, HTTPS for secure communications.

    • Data Flow: Data is sent into the secure zone.
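
The whitelisting idea behind the access zone can be illustrated with a minimal sketch. This is not the actual SPSP filter: the function name and the network ranges are hypothetical (the ranges are reserved documentation addresses), and real filtering is more likely enforced at the firewall or sFTP server level than in application code.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation ranges (RFC 5737),
# not real SPSP or data-provider addresses.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip belongs to any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```

A connection from an address outside every listed network would be rejected before any file transfer starts.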

2. SIB SECURE ZONE (SENSA)

  • Purpose: Handles all sensitive data and ensures that it is cleaned (any human contamination is removed) and stored securely.

  • Key Components:

    • Loader: Parses and loads data into the system.

    • Cleaner: Cleans raw data and de-identifies metadata.

    • Reporter: Prepares summaries or reports for federal authorities.

    • SPHN Connector: Connects to other BioMedIT nodes.

    • Database: spspsensitivedb stores all sensitive data.
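
As an illustration of what de-identifying metadata can look like, here is a minimal sketch. It is an assumption, not the SPSP Cleaner itself: the field names, the `deidentify` function, and the choice of a keyed hash (HMAC-SHA256) for pseudonymous sample IDs are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical field names; the real SPSP metadata schema is not described here.
IDENTIFYING_FIELDS = {"patient_name", "date_of_birth", "postal_code"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of record without identifying fields and with a pseudonymous sample ID."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    # Keyed hash: the same sample_id always maps to the same pseudonym,
    # but the pseudonym cannot be recomputed without the secret key.
    digest = hmac.new(secret_key, str(record["sample_id"]).encode("utf-8"), hashlib.sha256)
    cleaned["sample_id"] = digest.hexdigest()[:16]
    return cleaned
```

In this sketch the secret key would stay inside the secure zone, so only components in that zone could link a pseudonym back to the original identifier.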

3. SIB INTERNAL SPSP ZONE

  • Purpose: Processes and stores non-sensitive data using modular, pathogen-specific microservices.

  • Pathogen Instances (each accessed via HTTPS):

    • SCV2, VIRAL, INFLUENZA, BACTERIA, WASTEWATER, etc.

  • Databases:

    • Store cleaned, non-sensitive data for each instance.

  • Jobs Execution:

    • Handled on two HPC clusters, spsp-bio01 and spsp-bio02, both running SLURM.

    • Bioinformatics pipelines and tools are stored centrally under /tools.
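
Submitting a pipeline run to SLURM could look roughly like the sketch below. It only builds an `sbatch` command line and does not execute it. The script path under /tools, the job name, and the CPU count are hypothetical placeholders, and SPSP may submit jobs differently (for example through the SLURM REST interface).

```python
import shlex


def build_sbatch_command(pipeline_script: str, input_path: str, cpus: int = 4) -> list:
    """Build an sbatch command that runs pipeline_script on input_path."""
    # Quote both paths so spaces or special characters survive the shell used by --wrap.
    wrapped = f"{shlex.quote(pipeline_script)} {shlex.quote(input_path)}"
    return [
        "sbatch",
        "--job-name=spsp-pipeline",  # hypothetical job name
        f"--cpus-per-task={cpus}",
        f"--wrap={wrapped}",
    ]


# Hypothetical pipeline location under /tools and hypothetical input file.
command = build_sbatch_command("/tools/example_pipeline/run.sh", "/data/sample 01.fastq.gz")
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a host where `sbatch` is available.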

4. DMZ (Demilitarized Zone)

  • Purpose: Exposes controlled interfaces to users and external systems.

  • Components:

    • SPSP Portal (HTTPS): Web frontend for accessing data and dashboards.

      • /api provides access to non-sensitive data.

    • Reverse Proxy: Controls traffic between external users and internal services.

    • Keycloak: Manages authentication (SSO).
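
A client call to the /api endpoint might be prepared as in the following sketch. It assumes the API accepts a bearer access token obtained through Keycloak, which this page does not confirm. The host name, the `samples` resource, and the `pathogen` parameter are hypothetical; the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://spsp.example.org"  # hypothetical host, not the real portal address


def build_api_request(access_token: str, resource: str, **params) -> Request:
    """Build (but do not send) an authenticated GET request to /api/<resource>."""
    query = f"?{urlencode(params)}" if params else ""
    url = f"{BASE_URL}/api/{resource}{query}"
    headers = {
        "Authorization": f"Bearer {access_token}",  # assumed token scheme
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


# Hypothetical resource and filter, for illustration only.
request = build_api_request("example-token", "samples", pathogen="SCV2")
```

Passing the request to `urllib.request.urlopen` would send it through the reverse proxy, which sits between external users and the internal services.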

Data Flows & Security

  • Protocols (in purple):

    • sFTP, HTTPS, TCP/IP, SSH:22, SLURM REST

  • Color Coding:

    • Blue boxes: Virtual machines (VMs)

    • Orange borders: Frontend components

    • Zones: Separated by security level and function

Security Highlights

  • Strict segregation of sensitive and non-sensitive data.

  • Differentiated access control (e.g., Keycloak SSO).

  • Data is cleaned and de-identified before it becomes available for any internal or public access.

  • Integration with BioMedIT for highly secure biomedical data processing.
