M2 internship in bioinformatics and systems immunology

Internship · M2 internship · 6 months · Bac+5 / Master · i3 Lab, UMRS959, Sorbonne University · Paris 13 (France) · Stipend: 15% of the hourly social security ceiling (€4.35/h in 2025)

Start date: 2 January 2026

Keywords

regulatory T cells (Tregs); Deep Learning; Antigen Specificity; Public Database; Transfer Learning; Bioinformatics; Machine Learning; RNA-seq; single-cell; artificial intelligence; systems biology; Python; NGS

Description

Fine-Tuning State-of-the-Art Deep Learning Models to Predict TCR Specificity Using Curated Public Datasets

Name: UMRS 959 “Immunology-Immunopathology-Immunotherapy”

Affiliation: Inserm / Sorbonne Université

Address: Hôpital Pitié-Salpêtrière, 83 boulevard de l’Hôpital, Bâtiment CERVI, 4th floor, 75013 Paris

Website: https://www.i3-immuno.fr/en/#Research

Supervisors:
Encarnita Mariotti-Ferrandiz (encarnita.mariotti@sorbonne-universite.fr)
David Klatzmann (David.Klatzmann@Sorbonne-universite.fr)

Subject keywords:
Type 1 diabetes; regulatory T cells (Tregs); T-cell receptor (TCR); Deep Learning; Antigen Specificity; Public Database; Transfer Learning; Bioinformatics; Machine Learning; RNA-seq; single-cell; artificial intelligence; systems biology; Immunology; Python; NGS

Tools and methodologies:
Pre-trained deep learning models (e.g., transformers, CNNs, TCR-BERT, ESM2); fine-tuning strategies; curated public TCR datasets; negative sampling; classification metrics; Python (TensorFlow/PyTorch, scikit-learn)

Summary of lab’s interests:
The i3 laboratory uses systems immunology approaches to identify novel biomarkers for diagnosis and therapy, with a particular focus on Treg biology and autoimmune disorders (AD). One of the laboratory's distinctive areas of expertise is identifying antigen-specific T cells through the analysis of the T cell receptor repertoire using deep sequencing.

Summary of the proposed project:
Regulatory T cells (Tregs) are essential for maintaining immune homeostasis by suppressing self-reactive effector T cells. Their dysfunction contributes to autoimmune diseases, including type 1 diabetes (T1D). Like other T cells, Tregs are antigen-specific through their T cell receptors (TCRs), yet their antigen specificity remains poorly understood.

Predicting the specificity of TCRs is a central challenge in immunology. Several state-of-the-art (SOTA) deep learning models, such as ERGO-II (Springer et al., 2021), TULIP (Meynard-Piganeau et al., 2024), SABRE (Wang & Shen, 2023), and TITAN (Weber et al., 2021), have been developed to tackle this, trained on large public datasets of known TCR-antigen interactions. However, these models often require fine-tuning to perform optimally in specific biological contexts (e.g., murine datasets or lab-curated public datasets).

This internship project aims to:

  • Curate and clean TCR-antigen specificity data from public repositories (e.g., VDJdb, McPAS-TCR, IEDB) using in-house filtering pipelines (Jouannet et al., bioRxiv, 2024).
  • Design a robust negative sampling strategy to generate realistic true-negative datasets.
  • Fine-tune SOTA deep learning models on this curated dataset to improve predictive performance in a binary TCR-antigen classification task.
  • Evaluate model generalizability and robustness across cross-validation and external datasets (including unseen TCRs and antigens).
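As an illustration of the negative sampling and unseen-antigen evaluation steps listed above, the following is a minimal sketch and not the lab's pipeline. It assumes that positives are (CDR3, epitope) pairs, that a TCR paired with an epitope it is not reported to bind can be treated as a negative (a common simplification that may mislabel cross-reactive TCRs), and that test epitopes are fully held out from training. The function names `sample_negatives` and `split_by_epitope` and the CDR3 strings are hypothetical placeholders, not curated data.

```python
import random

# Toy positive pairs (CDR3, epitope). The CDR3 strings are illustrative
# placeholders only and are not taken from any curated database.
POSITIVES = [
    ("CASSLGQAYEQYF", "GILGFVFTL"),
    ("CASSPDRGNTEAFF", "NLVPMVATV"),
    ("CASSQETQYF", "GLCTLVAML"),
    ("CASRTGESYGYTF", "GILGFVFTL"),
]


def sample_negatives(positives, ratio=1, seed=0):
    """Build assumed-negative pairs by re-pairing each CDR3 with epitopes
    it is not reported to bind (mismatched pairing)."""
    rng = random.Random(seed)
    known = set(positives)
    epitopes = sorted({ep for _, ep in positives})
    negatives = set()
    for cdr3, _ in positives:
        # Candidate epitopes: any epitope not already paired with this CDR3.
        candidates = [
            ep for ep in epitopes
            if (cdr3, ep) not in known and (cdr3, ep) not in negatives
        ]
        rng.shuffle(candidates)
        for ep in candidates[:ratio]:
            negatives.add((cdr3, ep))
    return sorted(negatives)


def split_by_epitope(pairs, held_out_fraction=0.2, seed=0):
    """Split pairs so that every test epitope is absent from training."""
    rng = random.Random(seed)
    epitopes = sorted({ep for _, ep in pairs})
    rng.shuffle(epitopes)
    n_test = max(1, round(len(epitopes) * held_out_fraction))
    test_epitopes = set(epitopes[:n_test])
    train = [p for p in pairs if p[1] not in test_epitopes]
    test = [p for p in pairs if p[1] in test_epitopes]
    return train, test


negatives = sample_negatives(POSITIVES, ratio=1, seed=0)
# Label positives 1 and assumed negatives 0 for binary classification.
labelled = [(c, e, 1) for c, e in POSITIVES] + [(c, e, 0) for c, e in negatives]
train_pairs, test_pairs = split_by_epitope(POSITIVES + negatives)
print(len(negatives), len(train_pairs), len(test_pairs))
```

Splitting by epitope rather than by pair is one way to probe generalization to unseen antigens; an analogous split on CDR3 would do the same for unseen TCRs.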

The candidate will gain hands-on experience with large-scale biological data integration, deep learning pipelines, and interpretation of computational immunology models.

Candidate profile:
The candidate should have training in bioinformatics and a strong interest in computational immunology. Proficiency in Python and foundational knowledge of deep learning (e.g., using Keras, PyTorch, or TensorFlow) are required. Experience in data cleaning, biological sequence handling, or training ML models is a plus.

Lab description:
The laboratory offers a unique interdisciplinary environment that brings together biologists, immunologists, clinicians, computer scientists and bioinformaticians.

Supervisors' publications (related to the project):

  • Vantomme et al., bioRxiv, 2025
  • Jouannet et al., bioRxiv, 2024
  • Mhanna V. et al., Nat Rev Methods Primers, 2024
  • Mhanna V. et al., Cell Rep Methods, 2024
  • Le Gouge et al., medRxiv, 2023
  • Quiniou V. et al., eLife, 2023
  • Mhanna V. et al., Diabetes, 2021
  • Barennes P. et al., Nature Biotechnology, 2020
  • Ritvo P.G. et al., PNAS, 2018

Application

Procedure: Send an email to celine.albalaa@inserm.fr and encarnita.mariotti@sorbonne-universite.fr

Deadline: 30 November 2025

Contacts

 Celine AlBalaa
 celine.albalaa@inserm.fr

 Encarnita Mariotti-Ferrandiz
 encarnita.mariotti@sorbonne-universite.fr

Offer published on 29 July 2025, displayed until 30 November 2025