M2 internship: Phylogenomic analysis of regulatory sequence evolution in teleost fish

Internship · M2 internship · 6 months · Bac+4 · IBENS - UMR8197 - DYOGEN team · Paris (France)

Start date: 3 January 2022


comparative genomics, evolution, gene regulation


Phylogenomic analysis of regulatory sequence evolution in teleost fish

During the last 10 years, tremendous progress in genome sequencing technologies has delivered hundreds of high-quality genomes across diverse taxa, providing an unprecedented amount of information for studying the evolution of genes and genomes. With these data, combined with the massive use of genome-wide assays to probe the function of sequences, we are in a position to integrate evolutionary and functional analyses to answer important biological questions. One of the most challenging questions in genomics today is how gene expression is regulated by so-called enhancer sequences, usually distributed in the genomic vicinity of their target gene. This question is important not only because the evolution of regulatory sequences is key to the emergence of functional innovations as species adapt to changing environments, but also because misregulation of gene expression is implicated in numerous diseases.

The DYOGEN group at IBENS has strong expertise in using comparative genomics to explore questions linked to genome evolution and function, mainly along two axes of research. The first axis consists in developing methods to reconstruct ancestral genome organisation in order to understand how chromosomes are rearranged during evolution, which in turn allows a better understanding of gene evolutionary histories. These methods and resources include AGORA [in preparation], PhylDiag[1], Genomicus[2] and SCORPiOS[3]. The second axis consists in using large-scale comparisons across dozens of genomes to predict the position of regulatory sequences and to identify the genes they regulate, through the application of a new method called PEGASUS[4].

So far, we have applied PEGASUS to 35 species in the tetrapod clade and, on a smaller scale, to 7 fish species. The latter number is clearly too small to give a broad view of enhancers across the huge diversity of fish, especially in light of their sometimes complex genomes. The proposed project aims first to apply PEGASUS to more than 50 fish genomes, and then to use the predicted enhancers and target genes to gain knowledge about the organisation of fish genomes and the dynamics of their evolution.

Fish are an interesting model for exploring biological questions linked to regulatory sequences. First, in contrast to other vertebrates such as mammals and birds, the fish lineage has undergone several whole-genome duplications (WGDs) since its radiation about 350 million years ago. A first WGD took place at the origin of teleosts (most of the ~60,000 species of fish known today), and several taxon-specific WGDs occurred later, for example in the salmonids and the cyprinids. These major events double all chromosomes and genes, after which massive gene loss reduces the gene number. A widely accepted model proposes that gene regulatory landscapes play a role in determining whether genes are retained or lost; one aim of the project is to test whether this model is supported. Second, several fish species are important experimental models, such as zebrafish, medaka, trout and Xiphophorus, and others are economically valuable, such as salmon, turbot and carp. A better understanding of the genetic determinants governing their development and physiology, assisted by vast amounts of functional data, is thus a potential outcome of the project.

The project will exploit extensive genome-wide datasets such as multiple genome alignments, ancestral genome reconstructions, gene expression data, epigenetic marks, chromatin accessibility and 3D genome organisation in several model species. The DYOGEN team participates in several relevant projects, such as the GenoFish ANR project (https://anr.fr/Project-ANR-16-CE12-0035), the AQUA-FAANG European project (https://www.aqua-faang.eu), and a more recent ongoing initiative aiming to produce thousands of high-quality reference genomes from marine species in Europe.

Candidates for this project should show a keen interest in evolution and genomics in general and have some programming competence (R and/or Python).

The project will be supervised by François Giudicelli (CR Inserm, DYOGEN group) and Hugues Roest Crollius (DR CNRS, DYOGEN group).

1. Lucas JM, Muffato M, Roest Crollius H. PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees. BMC Bioinformatics 2014; 15:268

2. Muffato M, Louis A, Poisnel CE, et al. Genomicus: a database and a browser to study gene synteny in modern and ancestral genomes. Bioinformatics 2010; 26:1119–1121

3. Parey E, Louis A, Cabau C, et al. Synteny-guided resolution of gene trees clarifies the functional impact of whole genome duplications. Mol. Biol. Evol. 2020;

4. Clément Y, Torbey P, Gilardi-Hebenstreit P, et al. Enhancer-gene maps in the human and zebrafish genomes using evolutionary linkage conservation. Nucleic Acids Res. 2020; 48:2357–2371


Application procedure: Send an application email with a CV and a cover letter to Hugues Roest Crollius (hrc@bio.ens.psl.eu) and François Giudicelli (francois.giudicelli@bio.ens.psl.eu).

Deadline: 3 January 2022


Hugues Roest Crollius


Offer published on 18 August 2021, displayed until 3 January 2022