PhD in Machine Learning for single cell multi-omics data and cell-cell communication

Fixed-term contract · PhD thesis · 36 months · Master's degree (Bac+5) · Institut Curie · Paris (France) · Fully-funded PhD position

Start date: 1 October 2024

Keywords

causal networks, machine learning, single cell multi-omics data, cell-cell communication, neuroinflammatory disease

Description

Causal network analysis of single cell multi-omics data to dissect cell-cell communication in neuroinflammatory disease

PhD Supervisor: Hervé Isambert, DR CNRS (Institut Curie, UMR168, Paris, Isambert lab).

PhD Collaborator: Simon Fillatreau, PU-PH (Institut Necker Enfants Malades, U1151, Paris, Fillatreau lab).

 

This interdisciplinary PhD project aims to exploit novel Machine Learning methods to uncover the causal molecular networks underpinning cell differentiation, activation and communication in the brain, in close collaboration with the Fillatreau lab at the Institut Necker Enfants Malades (INEM), which investigates the role of adaptive immunity in autoimmune and infectious diseases (Manfroi 2024, Shen 2014, Fillatreau 2002). More specifically, we will analyse single cell multi-omics data to dissect the complex molecular networks leading to pathogenic versus protective cell-cell interactions in neuroinflammatory disease. Single cell multi-omics technologies (such as scRNA-seq and scATAC-seq) can, in principle, scale up the discovery of novel biological processes in gene regulation and cell differentiation, as well as their disease-associated disruption. However, the functional interpretation of these high-dimensional gene expression data remains challenging, as it requires distinguishing causal relationships from mere correlations in single cell multi-omics data.

 

The Isambert lab recently developed novel Machine Learning methods to learn causal networks from a broad variety of biological or biomedical datasets, from single cell expression data (Verny 2017, Sella 2018, Desterke 2020, Miladinovic 2024, Manfroi 2024) and live-cell imaging data (Simon 2024) to biomedical data (Cabeli 2020, Sella 2022, Ribeiro-Dantas 2024). These methods can learn a large class of graphical models including undirected, directed and possibly bidirected edges, the latter originating from common causes that are unobserved in the available dataset. The unsupervised Machine Learning approach combines multivariate information dependency metrics (Verny 2017) between heterogeneous mixed-type (continuous / categorical) variables (Cabeli 2020) with interpretable constraint-based graphical models (Li 2019, Ribeiro-Dantas 2024). In brief, it starts from a complete graph and iteratively removes dispensable edges by assessing significant information contributions from indirect paths, while guaranteeing that these paths are consistent with the final graph. The remaining edges are then oriented based on the signature of causality in observational data (Verny 2017, Ribeiro-Dantas 2024). The resulting method (MIIC) outperforms competing methods on a broad range of benchmark networks, achieving better results with ten to a hundred times fewer samples and running ten to a hundred times faster than state-of-the-art methods (Verny 2017, Ribeiro-Dantas 2024, Simon 2024).
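To give applicants a feel for the edge-removal step described above, here is a minimal Python sketch of constraint-based skeleton learning on toy discrete data. It is not the MIIC implementation: the function names (`mutual_info`, `learn_skeleton`), the fixed threshold of 0.01 nats and the restriction to conditioning on at most one variable are all simplifying assumptions made for illustration, and edge orientation is left out entirely.

```python
import itertools
import numpy as np

def mutual_info(x, y, z=None):
    """Plug-in estimate of I(X;Y|Z) in nats for discrete 1-D integer arrays."""
    if z is None:
        z = np.zeros(len(x), dtype=int)  # single stratum: plain I(X;Y)
    mi = 0.0
    for zv in np.unique(z):
        mask = z == zv
        # Empirical joint distribution of (X, Y) within this stratum of Z.
        xv, xi = np.unique(x[mask], return_inverse=True)
        yv, yi = np.unique(y[mask], return_inverse=True)
        joint = np.zeros((len(xv), len(yv)))
        np.add.at(joint, (xi, yi), 1)
        joint /= joint.sum()
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        # Weight each stratum by its empirical probability P(Z = zv).
        mi += mask.mean() * np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz]))
    return mi

def learn_skeleton(data, threshold=0.01):
    """Start from a complete graph and drop edges explained by an indirect path.

    Assumption: a fixed threshold and conditioning sets of size 0 or 1 only.
    """
    n_vars = data.shape[1]
    edges = set(itertools.combinations(range(n_vars), 2))  # complete graph
    for i, j in sorted(edges):
        others = [k for k in range(n_vars) if k not in (i, j)]
        for k in [None] + others:
            z = None if k is None else data[:, k]
            if mutual_info(data[:, i], data[:, j], z) < threshold:
                edges.discard((i, j))  # edge i-j is dispensable given k
                break
    return edges

# Toy chain A -> B -> C: each variable copies its parent with 10% random flips.
rng = np.random.default_rng(0)
n = 5000
a = rng.integers(0, 2, n)
b = np.where(rng.random(n) < 0.1, 1 - a, a)
c = np.where(rng.random(n) < 0.1, 1 - b, b)
data = np.column_stack([a, b, c])

skeleton = learn_skeleton(data)
print(sorted(skeleton))
```

In this toy chain, A and C are strongly correlated, yet the A-C edge is removed because conditioning on B leaves almost no residual information between them, whereas the A-B and B-C edges survive every conditioning test.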

 

The Fillatreau lab recently developed an adoptive B cell therapy for a pre-clinical mouse model of multiple sclerosis, an autoimmune disease of the central nervous system (CNS) that disrupts the flow of information within the brain, and between the brain and body, leading to paralysis. The injection of IL-10-producing regulatory B cells (iBregs) into mice at the peak of the disease was found to have a rapid and lasting curative effect (Manfroi 2024). The analysis of scRNA-seq transcriptomic data (10X Genomics), in collaboration with the Isambert lab, identified microglia in the CNS as the main cell type targeted by the injected iBregs (Manfroi 2024). Causal network analysis further highlighted how this effective treatment resets microglial cells from a pathogenic autoimmune reactive state to their normal protective state, by downregulating key regulators and their downstream target genes in microglia (Manfroi 2024).

 

The present PhD project aims to extend these analyses to refine the complex molecular and cellular networks involved in this cellular therapy, based on new scRNA-seq data obtained with recent highly sensitive technologies (Smart-seq3 and FLASH-seq) that provide high-quality data for machine learning analyses. First, we will explicitly include the cell-cell interactions between iBregs and microglia, as well as other responding cell types, by integrating intracellular network inference with intercellular information from cell-cell communication approaches, such as NicheNet, which relies on known ligand-receptor interactions. Second, we will extend the causal network analysis beyond differentially expressed genes (DEGs) to include all expressed genes captured by the highly sensitive Smart-seq3 / FLASH-seq technologies, as our preliminary results show that restricting gene networks to genes selected through DEG analysis can overlook many relevant interactions across expressed genes. The overall objective of this quantitative inference project is to uncover how the injected iBregs manage to intercept the pathogenic reaction and reset the protective function of microglia in the CNS. Ultimately, some of the most interesting predictions of novel cause-effect functional interactions at play in this cellular therapy will be tested in the Fillatreau lab through targeted gene perturbations in the relevant cell types, using specific drugs, RNA interference, gene knockout or CRISPR-based gene editing. A further perspective for medical applications in humans will be to investigate whether the molecular networks associated with iBreg treatment involve genes that have been linked to the risk of developing human autoimmune or neurodegenerative conditions known to involve an inflammatory component (Alzheimer's disease, amyotrophic lateral sclerosis and the aging brain).
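As a rough illustration of the kind of intercellular information mentioned above, the following Python sketch scores candidate ligand-receptor pairs between a sender and a receiver cell type by multiplying their mean expression levels. This is a deliberately simple expression-product heuristic and does not reproduce NicheNet or any method used in the project; the expression values are invented toy numbers, and the helper names (`mean_expr`, `lr_scores`) are hypothetical.

```python
import numpy as np

# Toy data (invented values): rows are cells, columns are genes.
genes = ["Il10", "Il10ra", "Tgfb1", "Tgfbr1"]
cell_types = np.array(["iBreg", "iBreg", "microglia", "microglia"])
expr = np.array([
    [5.0, 0.0, 1.0, 0.0],
    [3.0, 0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 4.0, 0.0, 3.0],
])

# Candidate ligand-receptor pairs, assumed to come from a curated prior database.
lr_pairs = [("Il10", "Il10ra"), ("Tgfb1", "Tgfbr1")]

def mean_expr(gene, cell_type):
    """Mean expression of one gene across all cells of one cell type."""
    return expr[cell_types == cell_type, genes.index(gene)].mean()

def lr_scores(sender, receiver):
    """Score each pair as mean ligand (sender) times mean receptor (receiver)."""
    return {
        (ligand, receptor): mean_expr(ligand, sender) * mean_expr(receptor, receiver)
        for ligand, receptor in lr_pairs
    }

scores = lr_scores("iBreg", "microglia")
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)
```

High-scoring pairs of this kind could then serve as candidate intercellular edges linking the intracellular networks inferred separately in each cell type.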

 

References

Cabeli V, Verny L, Sella N, Uguzzoni G, Verny M, Isambert H (2020). Learning clinical networks from medical records based on information estimates in mixed-type data. PLoS Comput Biol 16(5):e1007866.

Desterke C, Petit L, Sella N, Chevallier N, Cabeli V, Coquelin L, Durand C, Oostendorp RAJ, Isambert H, Jaffredo T, Charbord P (2020). Inferring gene networks in bone marrow Hematopoietic Stem Cell-supporting stromal niche populations. iScience 23(6):101222.

Fillatreau S, Sweenie CH, McGeachy MJ, Gray D, Anderton SM (2002). B cells regulate autoimmunity by provision of IL-10. Nature Immunol. 3(10):944-50.

Li H, Cabeli V, Sella N, Isambert H (2019). Constraint-based causal structure learning with consistent separating sets. Advances in Neural Information Processing Systems (NeurIPS) 32, 14257-14266.

Manfroi B, Dang VD, Jungmann A, Borzakian S, Beauvineau C, Dupuis L, Guffart E, Borst K, Chyzak G, Bui-Thi C, Tong Y, Frischbutter S, Hamann A, Nguyen NT, Salem Wehbe L, El Behi M, Luka M, Ménager M, Jouneau L, Boudinot P, Jung S, Isambert H, Prinz M, von Kries P, Specker E, Walter J, Mahuteau-Betzer F, Fillatreau S (2024). Induced regulatory B cells stably expressing IL-10 cure CNS autoimmunity by targeting microglia. Science, under revision.

Miladinovic O, Canto P-Y, Pouget C, Piau O, Radic N, Freschu P, Megherbi A, Prats CB, Jacques S, Hirsinger E, Geeverding A, Dufour S, Petit L, Souyri M, North T, Isambert H, Traver D, Jaffredo T, Charbord P, Durand C (2024). A multistep computational approach reveals a neuro-mesenchymal cell population in the embryonic hematopoietic stem cell niche. Development 151, dev.202614.

Ribeiro-Dantas M, Li H, Cabeli V, Dupuis L, Simon F, Hettal L, Hamy AS, Isambert H (2024). Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients. iScience, in press.

Sella N, Verny L, Uguzzoni G, Affeldt S, Isambert H (2018). MIIC online: a web server to reconstruct causal or non-causal networks from non-perturbative data. Bioinformatics 34(13):2311-2313.

Sella N, Hamy AS, Cabeli V, Darrigues L, Laé M, Reyal F, Isambert H (2022). Interactive exploration of a global clinical network from a large breast cancer cohort. npj Digital Med. 5, 113.

Shen P, Roch T, Lampropoulou V, O'Connor RA, Stervbo U, Hilgenberg E, Ries S, Dang VD, Jaimes Y, Daridon C, Li R, Jouneau L, Boudinot P, Wilantri S, Sakwa I, Miyazaki Y, Leech MD, McPherson RC, Wirtz S, Neurath M, Hoehlig K, Meinl E, Grützkau A, Grün JR, Horn K, Kühl AA, Dörner T, Bar-Or A, Kaufmann SHE, Anderton SM, Fillatreau S (2014). IL-35-producing B cells are critical regulators of immunity during autoimmune and infectious diseases. Nature 507(7492):366-370.

Simon F, Comes MC, Tocci T, Dupuis L, Cabeli V, Lagrange N, Mencattini A, Parrini MC, Martinelli E, Isambert H (2024). CausalXtract: a flexible pipeline to extract causal effects from live-cell time-lapse imaging data. eLife, in press.

Verny L, Sella N, Affeldt S, Singh PP, Isambert H (2017). Learning causal networks with latent variables from multivariate information in genomic data. PLoS Comput Biol 13(10):e1005662.

Application

Procedure and profile: Applicants should have a Master’s degree in Bioinformatics or Computational Biology, with a deep interest and good skills in data analysis. Applicants with a ML/AI/CS background, a clear interest in biological data analysis and some initial experience in it are also welcome. The PhD fellowship is funded for 3 years by the Fondation pour la Recherche Médicale (starting Sept-Oct 2024). Applicants should send a detailed CV, including University transcripts with marks and the email addresses of two referees, to herve.isambert@curie.fr.

Deadline: 31 May 2024

Contacts

Hervé Isambert

herve.isambert@curie.fr

 http://kinefold.curie.fr/isambertlab/PAPERS/PhD_project_FRM_2024_3.pdf

Posted on 5 April 2024, displayed until 31 May 2024