Keywords
histone
chromatin states
lactylation
acetylation
epigenomics
spermatogenesis
post-translational modification
AI
ChromHMM
convolutional neural network
dimensionality reduction
Description
Bioinformatics thesis
Integrated effects of histone protein acetylation and lactylation, and of the 3D structure of the genome, on gene expression regulation
Thesis supervisors: Delphine Pflieger and Thomas Fortin, BGE (Biosciences and Bioengineering for Health) laboratory, CEA Grenoble.
Collaboration with Christophe Battail (CEA Grenoble) and Julie Cocquet (Institut Cochin, Paris).
In all cells of a eukaryotic organism, DNA wraps around histone proteins to form chromatin. The dynamic modification of histones by various chemical groups allows fine regulation of gene expression. A wide range of groups derived from cellular metabolites are now known to modify histone lysines, generally promoting transcription. These include acetylation, which has been studied for over 60 years, and lactylation, which was first described in 2019 as being induced by glycolysis [1]. In addition, chromatin folds to bring gene promoters into contact with distant regulatory regions called enhancers. Gene expression is thus finely regulated by combinations of histone marks at gene promoters and enhancers, and by the 3D structure of the genome. This thesis project aims to characterize the dynamic impact of these epigenomic features on transcriptional regulation in the context of murine spermatogenesis.
Spermatogenesis is the process by which male germ cells differentiate into spermatozoa [2–4]. It is a remarkable model for studying the complex mechanisms regulating gene expression in relation to chromatin dynamics and metabolic disturbances. Male reproduction is one of the physiological processes affected by metabolic disturbances: obesity is a risk factor for male infertility and leads to less efficient spermatogenesis [5]. Furthermore, spermatogenesis is highly dynamic from a transcriptional point of view [6] and involves stages of dramatic chromatin remodeling leading to the formation of spermatozoa.
A wide variety of epigenomic data have been accumulated in the context of spermatogenesis, enabling advanced bioinformatic analyses. We recently showed that lactylation modifies a large number of lysines in histones H3 and H4 in mouse testes [7]. To study the cellular functions of lactylation in relation to acetylation, we used CUT&Tag sequencing to map the genome-wide distribution of these two marks, at three lysine residues, in three cell stages: spermatogonia, spermatocytes, and round spermatids. We also acquired three-dimensional chromatin structure information at these stages using Promoter Capture Hi-C (PCHi-C). Quality controls and harmonization have been performed on all of these data. The aim of this thesis is to position acetylation and lactylation marks within the complex landscape of modifications that constitute the “histone language,” whose combination at the promoter of each gene and at enhancers finely modulates gene expression.
As a first step, the ChromHMM software [8,9] will be used to identify potential novel chromatin states from the epigenomic datasets generated by the team, particularly those related to histone lactylation. This approach, based on a hidden Markov model, enables the inference of chromatin states and the analysis of their association with gene expression.
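To illustrate the general idea of inferring chromatin states with a hidden Markov model, here is a minimal sketch, not ChromHMM's own code. It assumes each genomic bin carries a present/absent call for each mark, that marks are independent given the hidden state, and that the most likely state path is found with Viterbi decoding. The two states, the mark order (acetylation, lactylation), and all probabilities are invented for illustration; in practice they would be learned from the data.

```python
import math

# Hypothetical toy model: state 0 = "background", state 1 = "active".
# EMIT[s][m] = assumed probability that mark m is present in a bin in state s.
# Mark order (illustrative): acetylation, lactylation.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1],
         [0.2, 0.8]]
EMIT = [[0.05, 0.05],
        [0.90, 0.80]]


def log_emission(state, obs):
    """Log-probability of a binary mark vector under independent Bernoulli emissions."""
    total = 0.0
    for p, present in zip(EMIT[state], obs):
        total += math.log(p if present else 1.0 - p)
    return total


def viterbi(observations):
    """Return the most likely sequence of hidden states for a list of binary mark vectors."""
    n_states = len(START)
    # score[s] = best log-probability of any path ending in state s at the current bin
    score = [math.log(START[s]) + log_emission(s, observations[0])
             for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev] + math.log(TRANS[best_prev][s])
                             + log_emission(s, obs))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Six consecutive bins; the two middle bins carry both marks.
bins = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 0), (0, 0)]
states = viterbi(bins)
print(states)
```

With these toy parameters, the two bins carrying both marks are assigned to the "active" state and the flanking bins to "background". Each bin is labeled using only its own marks and those of its linear neighbors.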
However, this methodology relies on the linear organization of the genome and does not account for the 3D structure of chromatin, which can bring genomically distant regions into close spatial proximity. To incorporate this dimension, we propose to develop a machine learning model, potentially based on a convolutional neural network, trained on chromosomal interaction data derived from Promoter Capture Hi-C. This approach should enable the identification of chromatin states that account for the 3D architecture of the genome and thereby refine our understanding of transcriptional regulation mechanisms. Finally, the interpretation of the structures learned by the model will be facilitated through dimensionality reduction techniques such as topological data analysis.
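One possible way to expose 3D contacts to a downstream model is to build, for each promoter, a feature vector that combines its own marks with the marks of the regions it contacts. The sketch below is only an illustration of that idea, not the model proposed for the thesis. The function name `contact_features`, the bin identifiers, and the use of a contact-weighted mean are all assumptions made for the example.

```python
def contact_features(marks_per_bin, contacts):
    """Combine each promoter's own marks with the contact-weighted mean of distal marks.

    marks_per_bin: dict mapping a bin identifier to a tuple of mark signals.
    contacts: list of (promoter_bin, distal_bin, weight) tuples, e.g. derived
              from promoter-capture interaction scores (hypothetical input format).
    Returns a dict mapping each promoter bin to [own marks..., distal means...].
    """
    n_marks = len(next(iter(marks_per_bin.values())))
    sums = {}      # promoter -> weighted sum of distal mark signals
    weights = {}   # promoter -> total contact weight
    for promoter, distal, weight in contacts:
        acc = sums.setdefault(promoter, [0.0] * n_marks)
        for i, value in enumerate(marks_per_bin[distal]):
            acc[i] += weight * value
        weights[promoter] = weights.get(promoter, 0.0) + weight

    features = {}
    for promoter, acc in sums.items():
        distal_mean = [value / weights[promoter] for value in acc]
        features[promoter] = list(marks_per_bin[promoter]) + distal_mean
    return features


# Toy example: one promoter (P1) contacting two distal regions (E1, E2).
# Mark order (illustrative): acetylation, lactylation.
marks = {"P1": (1, 0), "E1": (1, 1), "E2": (0, 1)}
contacts = [("P1", "E1", 3.0), ("P1", "E2", 1.0)]
features = contact_features(marks, contacts)
print(features)  # {'P1': [1, 0, 0.75, 1.0]}
```

Here the promoter itself lacks the second mark, but both contacted regions carry it, so the distal part of the feature vector captures information that a purely linear segmentation would not see.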
Research program:
- Analysis of lactylation dynamics between stages of differentiation
- Prediction of chromatin states using machine learning
- Integration of lactylation with other marks and the 3D structure of chromatin
Methodology, study techniques, resources:
- Bioinformatics methods in transcriptomics and epigenomics
- Bioanalysis: statistical approaches to unsupervised classification, regression and association testing, multidimensional data visualization (heatmaps), gene ontology analysis
- Use of existing machine learning models (in particular ChromHMM [8,9]) to define chromatin states and development of an original model integrating chromatin 3D structure information.
- Computing resources: CEA Grenoble and UGA (Gricad) clusters
Candidate profile:
Master’s degree or engineering school degree in bioinformatics. Experience in processing and interpreting RNA-seq, ChIP-seq, and/or CUT&Tag data will be an asset. Strong interest in epigenetic regulatory mechanisms and in developing original tools using modeling.
Application deadline: April 20, 2026. Please send your CV, a cover letter, reference letters from previous internship supervisors, and your academic transcripts for the last two years (M1/M2 or 4th and 5th year of engineering school) to delphine.pflieger@cea.fr and thomas.fortin@cea.fr
Thesis start date: October 1, 2026
Monthly salary: approximately EUR 1,880 net
References
1. Zhang, D. et al. Metabolic regulation of gene expression by histone lactylation. Nature 574, 575–580 (2019).
2. El Kennani, S. et al. Systematic quantitative analysis of H2A and H2B variants by targeted proteomics. Epigenetics Chromatin 11, (2018).
3. Crespo, M. et al. Multi-omic analysis of gametogenesis reveals a novel signature at the promoters and distal enhancers of active genes. Nucleic Acids Res. 48, 4115–4138 (2020).
4. Blanco, M. et al. DOT1L regulates chromatin reorganization and gene expression during sperm differentiation. EMBO Rep. 24, e56316 (2023).
5. Carrageta, D. F. et al. Signatures of metabolic diseases on spermatogenesis and testicular metabolism. Nat. Rev. Urol. 21, 477–494 (2024).
6. Soumillon, M. et al. Cellular Source and Mechanisms of High Transcriptome Complexity in the Mammalian Testis. Cell Rep. 3, 2179–2190 (2013).
7. Manessier, J. et al. Both L-lactyl and D-lactyl enantiomers modify histones in mouse testis. Preprint at https://doi.org/10.1101/2025.10.09.681385 (2025).
8. Ernst, J. & Kellis, M. Chromatin-state discovery and genome annotation with ChromHMM. Nat. Protoc. 12, 2478–2492 (2017).
9. Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).