Keywords
Arabidopsis thaliana
Transposable Elements
DNA Methylation
Gene Expression
GWAS
Post-GWAS
Epigenetics
Description
Genome-Wide Association Studies of Transposable Element Polymorphisms and Epigenetic Regulation in Arabidopsis thaliana
Supervisors
Chloé-Agathe Azencott, Katia Antonenko
Center of Computational Biology, Mines Paris PSL
Context and Motivation
Genome-wide association studies (GWAS) have become an essential tool for identifying genetic variants associated with phenotypic traits. However, conventional GWAS approaches mainly focus on single-nucleotide polymorphisms (SNPs) and often overlook structural variation that can have major functional impacts, in particular transposable element insertion polymorphisms (TIPs).
Transposable elements constitute a significant fraction of the A. thaliana genome, as well as of the human genome, and can profoundly influence gene expression and genome stability through both local (cis) and distant (trans) effects. Moreover, transposable elements are tightly regulated by DNA methylation, a key epigenetic mechanism that can spread to neighboring genomic regions and modulate gene expression patterns. Consequently, variation in transposable element insertions and their associated epigenetic states represents an underexplored layer of genomic diversity that could explain phenotypic variation not captured by SNP-based GWAS.
This internship project aims to perform genome-wide association tests in A. thaliana that take into account TIPs, the methylation status of inserted transposable elements, and the spreading of that methylation to nearby regions. It also includes a post-GWAS integrative analysis of the findings, incorporating gene expression data to investigate both cis- and trans-regulatory effects. The ultimate goal is to better understand how transposable element polymorphisms, combined with epigenetic modifications, contribute to gene regulation and phenotypic variation.
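To give a concrete idea of what a single association test on a TIP could look like, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method the project will use: the function name, the input format (one presence/absence flag and one quantitative phenotype value per accession), and the choice of a permutation test are all hypothetical. In particular, a real GWAS would typically also correct for population structure, for example with a linear mixed model, which this sketch does not do.

```python
import random
from statistics import mean


def tip_permutation_test(presence, phenotype, n_perm=2000, seed=0):
    """Two-sided permutation test for one TIP (illustrative sketch only).

    presence  -- list of 0/1 flags, one per accession (1 = insertion present)
    phenotype -- list of quantitative trait values, same order as `presence`
    Returns (observed difference in means, permutation p-value).
    """
    if len(presence) != len(phenotype):
        raise ValueError("presence and phenotype must have the same length")

    def mean_difference(labels):
        carriers = [y for p, y in zip(labels, phenotype) if p]
        others = [y for p, y in zip(labels, phenotype) if not p]
        if not carriers or not others:
            raise ValueError("both carrier and non-carrier groups are required")
        return mean(carriers) - mean(others)

    observed = mean_difference(presence)

    # Shuffle the presence/absence labels to break any link with the
    # phenotype, and count how often a difference at least as extreme as
    # the observed one arises by chance.
    rng = random.Random(seed)
    labels = list(presence)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(mean_difference(labels)) >= abs(observed):
            hits += 1

    # Add-one correction so that the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

In a genome-wide setting, a test of this kind would be repeated for every TIP, and the resulting p-values would need to be corrected for multiple testing.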
Data and Resources
The project will leverage different levels of data, including:
- Genome sequences from multiple accessions of A. thaliana, sequenced with long-read technology.
- Annotated transposable elements, including family, type, and insertion or deletion location.
- Gene annotations for all accessions.
- Methylation profiles, describing the methylation status of individual cytosines across the genome in three different contexts (CG, CHG, and CHH).
- Gene expression data across the same accessions, providing quantitative expression levels for all genes.
These datasets are complementary and collectively enable a comprehensive view of the interplay between genetic variation, epigenetic regulation, and transcriptional output.
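For readers less familiar with the three methylation contexts mentioned above, the following short Python sketch shows how a cytosine can be assigned to CG, CHG, or CHH from the two bases that follow it, where H stands for A, C, or T. The function name and its interface are hypothetical and chosen for illustration only. It handles cytosines on the forward strand; cytosines on the reverse strand would be classified the same way after taking the reverse complement.

```python
def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the forward-strand cytosine at index i.

    Returns None if position i is not a cytosine, or if the context cannot
    be determined (end of sequence or ambiguous bases such as N).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None

    following = seq[i + 1:i + 3]  # up to two bases downstream of the cytosine
    h_bases = {"A", "C", "T"}     # IUPAC code H: any base except G

    if len(following) >= 1 and following[0] == "G":
        return "CG"
    if len(following) < 2 or following[0] not in h_bases:
        return None
    if following[1] == "G":
        return "CHG"
    if following[1] in h_bases:
        return "CHH"
    return None
```

For example, in the sequence "CCG" the first cytosine is in the CHG context and the second one is in the CG context.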
Profile of the Candidate
Education: Master’s student (M1 or M2) in bioinformatics, computational biology, computer science, machine learning, or related disciplines.
Skills: Programming experience (Python); familiarity with the Linux environment (bash) and with statistical testing.
Bonus: Prior experience with GWAS, methylation data, or TE analysis is appreciated, but not required.
Supervision and Environment
The student will be hosted within a computational biology research team (CBIO) and will be provided with computational resources, mentoring, regular discussions with our collaborators in plant biology from IBENS, and opportunities to discuss results in lab meetings and seminars.
Duration
4–6 months (flexible, depending on academic calendar).
Compensation
Statutory minimum internship stipend, as defined by law.