Keywords
regulatory T cells (Tregs)
Deep Learning
Antigen Specificity
Public Database
Transfer Learning
Bioinformatics
Machine Learning
RNA-seq
single-cell
artificial intelligence
systems biology
python
NGS
Description
Fine-Tuning State-of-the-Art Deep Learning Models to Predict TCR Specificity Using Curated Public Datasets
Name: UMRS 959 “Immunology-Immunopathology-Immunotherapy”
Affiliation: Inserm / Sorbonne Université
Address: Hôpital Pitié-Salpêtrière, 83 boulevard de l’hôpital, Bâtiment CERVI, 4ème étage, 75013 Paris
Website: https://www.i3-immuno.fr/en/#Research
Supervisors:
Encarnita Mariotti-Ferrandiz (encarnita.mariotti@sorbonne-universite.fr)
David Klatzmann (David.Klatzmann@Sorbonne-universite.fr)
Subject keywords:
Type 1 diabetes; regulatory T cells (Tregs); T-cell receptor (TCR); Deep Learning; Antigen Specificity; Public Database; Transfer Learning; Bioinformatics; Machine Learning; RNA-seq; single-cell; artificial intelligence; systems biology; Immunology; python; NGS
Tools and methodologies:
Pre-trained deep learning models (e.g., transformers, CNNs, TCR-BERT, ESM2); fine-tuning strategies; curated public TCR datasets; negative sampling; classification metrics; Python (TensorFlow/PyTorch, scikit-learn)
Summary of lab’s interests:
The i3 laboratory uses systems immunology approaches to identify novel biomarkers for diagnosis and therapy, with a particular focus on Treg biology and autoimmune disorders (AD). One of the laboratory's distinctive areas of expertise is the identification of antigen-specific T cells through deep-sequencing analysis of the T cell receptor repertoire.
Summary of the proposed project:
Regulatory T cells (Tregs) are essential for maintaining immune homeostasis by suppressing self-reactive effector T cells. Their dysfunction contributes to autoimmune diseases, including type 1 diabetes (T1D). Like other T cells, Tregs are antigen-specific through their T cell receptors (TCRs), yet their antigen specificity remains poorly understood.
Predicting the specificity of TCRs is a central challenge in immunology. Several state-of-the-art (SOTA) deep learning models, such as ERGO-II (Springer et al., 2021), TULIP (Meynard-Piganeau et al., 2024), SABRE (Wang & Shen, 2023), and TITAN (Weber et al., 2021), have been developed to address this challenge and are trained on large public datasets of known TCR-antigen interactions. However, these models often require fine-tuning to perform optimally in specific biological contexts (e.g., murine datasets or lab-curated public datasets).
This internship project aims to:
- Curate and clean TCR-antigen specificity data from public repositories (e.g., VDJdb, McPAS-TCR, IEDB) using in-house filtering pipelines (Jouannet et al., bioRxiv, 2024).
- Design a robust negative sampling strategy to generate realistic true-negative datasets.
- Fine-tune SOTA deep learning models on this curated dataset to improve predictive performance in a binary TCR-antigen classification task.
- Evaluate model generalizability and robustness using cross-validation and external datasets (including unseen TCRs and antigens).
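As an illustration of the negative sampling and unseen-antigen evaluation steps listed above, here is a minimal Python sketch. It is not the laboratory's pipeline: it assumes the common shuffling heuristic, in which a TCR paired with an epitope it is not recorded to bind is treated as a presumed non-binder, and it holds out whole epitopes for testing. The function names (shuffle_negatives, epitope_holdout_split) and the toy CDR3/epitope pairs are hypothetical placeholders, not records from VDJdb, McPAS-TCR, or IEDB.

```python
import random


def shuffle_negatives(positives, ratio=1, seed=0):
    """Build presumed-negative pairs by re-pairing each TCR with epitopes
    it is not recorded to bind (an assumption, not experimental evidence)."""
    rng = random.Random(seed)
    known = set(positives)
    epitopes = sorted({epitope for _, epitope in positives})
    negatives = set()
    for cdr3, _ in positives:
        # Candidate epitopes: never observed with this TCR, not already drawn.
        candidates = [
            epitope for epitope in epitopes
            if (cdr3, epitope) not in known and (cdr3, epitope) not in negatives
        ]
        for epitope in rng.sample(candidates, min(ratio, len(candidates))):
            negatives.add((cdr3, epitope))
    return sorted(negatives)


def epitope_holdout_split(pairs, test_epitopes):
    """Split pairs so that the held-out epitopes appear only in the test set."""
    held_out = set(test_epitopes)
    train = [pair for pair in pairs if pair[1] not in held_out]
    test = [pair for pair in pairs if pair[1] in held_out]
    return train, test


# Toy (CDR3 beta, epitope) pairs, for illustration only.
positives = [
    ("CASSAAAAEQYF", "GILGFVFTL"),
    ("CASSGGGGEQYF", "GILGFVFTL"),
    ("CASSLLLLEKLFF", "NLVPMVATV"),
    ("CSARTTTTGYTF", "GLCTLVAML"),
]
negatives = shuffle_negatives(positives, ratio=1, seed=0)
train, test = epitope_holdout_split(positives + negatives, ["GLCTLVAML"])
```

Shuffled negatives can include true binders that simply have not been observed, which is one reason the design of the negative set is a project objective in its own right.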
The candidate will gain hands-on experience with large-scale biological data integration, deep learning pipelines, and interpretation of computational immunology models.
Candidate profile:
The candidate should have training in bioinformatics and a strong interest in computational immunology. Proficiency in Python and foundational knowledge of deep learning (e.g., using Keras, PyTorch, or TensorFlow) are required. Experience in data cleaning, biological sequence handling, or training ML models is a plus.
Lab description:
The laboratory offers a unique interdisciplinary environment that brings together biologists, immunologists, clinicians, computer scientists and bioinformaticians.
Supervisors' publications (related to the project):
- Vantomme et al., bioRxiv, 2025
- Jouannet et al., bioRxiv, 2024
- Mhanna V et al., Nat Rev Methods Primers, 2024
- Mhanna V et al., Cell Rep Methods, 2024
- Le Gouge et al., medRxiv, 2023
- Quiniou V et al., eLife, 2023
- Mhanna V et al., Diabetes, 2021
- Barennes P et al., Nature Biotechnology, 2020
- Ritvo PG et al., PNAS, 2018