Keywords
Bioinformatics
causal models
probabilistic causal models
machine learning
gene regulation
developmental biology
Description
Location: Institute of Molecular Genetics of Montpellier (IGMM), France
Host group: AI for Genome Interpretation (AI4GI), Dr. Raimondi
Duration: 6 months
Starting date: first half of 2026
Project title
Unraveling Ubx regulation in Drosophila melanogaster using probabilistic causal models
Scientific background
Transcription factors (TFs) regulate gene expression by integrating intrinsic and environmental signals. Beyond DNA binding, some TFs—including Ultrabithorax (Ubx)—also interact with RNA and influence splicing, suggesting a role in co-transcriptional regulation.
How these regulatory layers are integrated within a coherent developmental network remains poorly understood.
Environmental stresses induce large-scale changes in transcription and splicing in Drosophila, but uncovering the underlying mechanisms requires explicit mathematical models that go beyond correlation-based analyses.
Objectives
This project aims to develop a Probabilistic Causal Model (PCM) of the Ubx target gene regulatory network in order to:
- infer causal relationships between genes and regulatory processes,
- perform in silico hypothesis testing,
- predict the effects of genetic or environmental perturbations.
Internship program
The student will join Dr. Raimondi’s group at IGMM and work in close collaboration with Dr. Carnesecchi (IGMM) on developmental genetics and RNA biology.
Main tasks include:
- integrating large-scale Drosophila expression and splicing datasets (ModENCODE, FlyBase, time-series and interventional data),
- defining a biologically grounded causal graph of the regulatory network,
- implementing an invertible structural causal model in Python using DoWhy,
- validating the model through conditional independence testing and graph refutation,
- querying the model to generate testable biological hypotheses (interventions, mediation analysis, causal effect decomposition).
Generated hypotheses will be experimentally tested by partner teams.
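
To give candidates a feel for the modeling tasks above, here is a minimal sketch of what "invertible" means for a structural causal model. It is a plain-Python toy and does not use DoWhy. The three-node graph (stress, ubx, splicing), the linear additive-noise equations, and all coefficients are hypothetical and chosen only for illustration; they are not results or design decisions of the project. Because each equation has the form value = f(parents) + noise, the noise can be recovered from an observed sample, which is what makes interventional and counterfactual queries possible.

```python
import random

# Hypothetical toy graph: stress -> ubx -> splicing, plus stress -> splicing.
# Coefficients are made up for illustration; they are not estimated from data.
PARENTS = {
    "stress": [],
    "ubx": ["stress"],
    "splicing": ["ubx", "stress"],
}
WEIGHTS = {
    "ubx": {"stress": -0.8},
    "splicing": {"ubx": 1.5, "stress": 0.4},
}
ORDER = ["stress", "ubx", "splicing"]  # topological order of the graph


def mechanism(node, values):
    """Deterministic part of the structural equation for `node`."""
    return sum(WEIGHTS[node][p] * values[p] for p in PARENTS[node])


def forward(noise, interventions=None):
    """Compute all variables from their noise terms, applying do() overrides."""
    interventions = interventions or {}
    values = {}
    for node in ORDER:
        if node in interventions:
            values[node] = interventions[node]  # do(node = value)
        else:
            values[node] = mechanism(node, values) + noise[node]
    return values


def abduct(observed):
    """Invert the additive-noise equations to recover each node's noise."""
    return {node: observed[node] - mechanism(node, observed) for node in ORDER}


def counterfactual(observed, interventions):
    """Abduction, action, prediction for one observed sample."""
    return forward(abduct(observed), interventions)


def average_effect(node, target, low, high, n=2000, seed=0):
    """Monte Carlo estimate of E[target | do(node=high)] - E[target | do(node=low)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        noise = {v: rng.gauss(0.0, 1.0) for v in ORDER}
        total += forward(noise, {node: high})[target]
        total -= forward(noise, {node: low})[target]
    return total / n
```

In this toy, `counterfactual(sample, {"ubx": 0.0})` asks what the splicing readout of one specific sample would have been had Ubx been set to zero, while `average_effect("ubx", "splicing", 0.0, 1.0)` estimates a population-level interventional effect. In the actual project, the mechanisms would be learned from data rather than fixed by hand.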
Candidate profile
MSc (M2) student in bioinformatics, computer science, computational biology, data science for biology, mathematics, statistics, or software engineering
- Solid Python programming skills
- Strong interest in statistical modeling, machine learning and causal inference
- Background in molecular biology or genetics is a plus
- Motivation for interdisciplinary research
Research environment
This internship is part of a highly interdisciplinary collaboration combining bioinformatics, causal machine learning and developmental biology at IGMM. It offers strong methodological training and potential opportunities for PhD continuation.