PhD Thesis in Bioinformatics

Fixed-term contract · PhD · 36 months · Master's degree (Bac+5) · Center for Computational Biology · Paris 06 (France)

Start date: 1 September 2026

Keywords

Cancer · Evolution · Copy number alteration · Machine learning

Description

Scientific background

High-grade serous ovarian cancer (HGSOC) remains one of the deadliest gynecologic malignancies, with a 5-year survival rate of only 43%, primarily due to late-stage diagnosis and the lack of effective early detection strategies [1]. However, early-stage detection raises survival to over 85%, underscoring the critical need to identify reliable early markers and to elucidate the molecular and cellular mechanisms underlying disease initiation [1]. A major barrier to achieving this lies in our incomplete understanding of the earliest events that drive the transformation from normal to pathological cell states. Recent studies using shallow whole-genome sequencing have indicated that structural copy number alterations, in particular, can stratify clones by aggressiveness [2]. However, such efforts have focused on advanced tumors, leaving the earliest molecular alterations preceding tumor onset unexplored.
The fallopian tube epithelium (FTE) is now widely regarded as the site of origin for HGSOC. However, many HGSOC cases lack detectable precursors. This raises the possibility that clonal expansion may begin in morphologically normal yet molecularly aberrant FTE cells. The frequent observation of multiple precursor lesions in individual cases [3] further suggests that transformation is not driven by a single clonal event but may involve a broader field of at-risk clones. Carriers of germline BRCA1/2 mutations are at markedly increased risk of developing HGSOC and frequently undergo prophylactic salpingectomy (surgical removal of the fallopian tubes), which reduces their risk by up to 80% [4]. These individuals offer a particularly potent lens through which early transformation can be examined. Their FTE, although histologically normal, may already harbor molecular and phenotypic abnormalities long before precursor lesions or malignancy become apparent.
Recent work has revealed that somatic alterations commonly found in cancer can also be detected in normal cells, transforming our understanding of somatic evolution. Studies in histologically normal tissues, including the skin, bronchial epithelium, esophagus, and gastric epithelium, have demonstrated that aging and environmental exposures can lead to the accumulation of somatic mutations previously considered cancer-specific, reshaping the way we think about cancer [5]. However, these studies have been limited in three important ways: (i) they primarily focused on single nucleotide variants (SNVs), with little attention to somatic copy number alterations (SCNAs); (ii) they relied on bulk sequencing, which limits resolution in polyclonal tissues, or on in vitro single-cell-derived clonal expansions, which can bias toward specific clones; and (iii) they lacked phenotypic or spatial-morphological context. Crucially, no such study has been conducted in the normal human FTE, which is inherently polyclonal and lacks clear anatomical boundaries that could guide clonal analysis using traditional bulk sequencing approaches. The question of whether SCNA mosaicism occurs in morphologically normal FTE is of particular interest, given that SCNAs are a defining genomic feature of HGSOC.
We hypothesize that clonal expansions driven by genomic alterations arise in phenotypically normal FTE prior to the appearance of histopathologically detectable precursor lesions, and that these early genomic changes are associated with subtle but quantifiable morphological manifestations that may not yet meet diagnostic criteria. We further propose that these “at-risk” cell populations can be identified and characterized through an integrated, multi-modal approach. This project has a specific focus on somatic copy number alterations (SCNAs), a hallmark of HGSOC [6] that remains understudied in normal tissue mutagenesis.
To test the above hypothesis, we propose to use the FTE of gBRCA1/2 mutation carriers to characterize the earliest genomic and phenotypic alterations that precede visible precursors. These tissues, collected during prophylactic removal of the fallopian tubes as a risk-reducing strategy, provide a unique and underexploited resource to study clonal evolution in a high-risk but still non-malignant context.

PhD Objectives

In the context of the EVOVAIR project, the objective of the proposed PhD is to study early molecular alterations in FTE samples collected by our partners at Charité Hospital in Berlin. For each patient, our partners will produce whole-genome sequencing data, spatial transcriptomic data, and histopathology slide images.
The first task of this PhD project will be to analyse the generated data in order to identify molecular alterations in healthy fallopian tube epithelial cells. Using evolutionary approaches, the PhD student will reconstruct the clonal evolution of the cells to identify clonal expansions. A further goal will be to identify molecular alterations that influence the ability of epithelial cells to expand. Using spatial transcriptomic data and histopathology slide images, the PhD student will also study the link between the molecular changes identified in whole-genome sequencing data and phenotypic changes at the transcriptomic and morphological levels, as well as their potential impact on the cells' immediate environment.
Required skills:
Candidates should have a background in statistics, machine learning, or bioinformatics, and a strong interest in both method development and biological and medical applications.
Prior experience in the analysis of next-generation sequencing data and/or in computer vision would be a plus but is not mandatory.

Scientific environment

The Center for Computational Biology (CBIO) is a research center at Mines Paris; it is affiliated with its Department “Mathematics and Systems” and the joint unit “Computational Oncology (U1331)” with Institut Curie and INSERM. The CBIO develops methods in artificial intelligence, machine learning, and computer vision for the life sciences, covering a wide range of applications from fundamental biology to clinical practice. CBIO’s collaborations allow it to work on data from various sources, such as DNA sequencing technologies, spatial transcriptomics, protein structures, large-scale microscopy, medical imaging, and electronic health records. The CBIO develops innovative mathematical methods and algorithms to analyze these massive, heterogeneous, and complex data, thus addressing biological or clinical questions. The CBIO is involved in several major initiatives in France, both for methodological development in AI and its applications in health.

Supervision
The PhD will be co-supervised by T. Walter, Professor in Machine Learning and Computer Vision at Ecole des Mines, and F. Massip, research scientist (chargé de recherche) in Bioinformatics at Ecole des Mines.

Funding
This PhD is funded by the ANR project EVOVAIR.

References:

[1] Torre LA, Trabert B, DeSantis CE, et al. Ovarian cancer statistics, 2018. CA Cancer J Clin. 2018;68(4):284-296. doi:10.3322/caac.21456
[2] Wang Y, Douville C, Chien YW, et al. Aneuploidy Landscape in Precursors of Ovarian Cancer. Clin Cancer Res. 2024;30(3):600-615. doi:10.1158/1078-0432.CCR-23-0932
[3] Gross AL, Kurman RJ, Vang R, Shih IM, Visvanathan K. Precursor lesions of high-grade serous ovarian carcinoma: morphological and molecular characteristics. J Oncol. 2010;2010:126295. doi:10.1155/2010/126295
[4] Daly MB, Pal T, Berry MP, et al. Genetic/Familial High-Risk Assessment: Breast, Ovarian, and Pancreatic, Version 2.2021, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw. 2021;19(1):77-102. doi:10.6004/jnccn.2021.0001
[5] Lopez-Bigas N, Gonzalez-Perez A. Are carcinogens direct mutagens? Nat Genet. 2020;52(11):1137-1138. doi:10.1038/s41588-020-00730-w
[6] The Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011;474(7353):609-615. doi:10.1038/nature10166

Application

Deadline: 13 March 2026

Contacts

 Florian Massip
 flNOSPAMorian.massip@minesparis.psl.eu

Offer published on 13 February 2026, displayed until 13 March 2026