Diversity and organization of alpha satellite DNA in Cercopithecini

 Internship · M2 internship · 6 months   Bac+5 / Master's level   Lab. Structure et instabilité des génomes - Muséum National d'Histoire Naturelle · Paris (France)   approx. 540 euros/month

 Start date: 3 January 2022

Keywords

Repeated DNA, evolutionary genomics, phylogeny, centromere, Principal Component Analysis

Description

Alpha satellite DNA is the most abundant family of tandemly repeated sequences in numerous primate genomes. Its 171 bp monomers extend over millions of base pairs, mainly in the centromeric regions of chromosomes. Several subfamilies can be distinguished that differ in their organization and chromosomal distribution. The ARChE team aims to understand the mechanisms that sustain the peculiar evolution of these sequences. We focus on Cercopithecini, a clade of Old World monkeys with numerous species that diverged within a few million years. Using targeted sequencing, bioinformatic analysis and cytogenetic experiments, we recently provided evidence for a great diversity of sequences and organizational patterns in two Cercopithecini species (Cacheux et al., 2016 and 2018).


We now have high-throughput sequencing datasets from 18 Cercopithecini species (Illumina paired-end, 2x150 bp, coverage >30x), as well as long-read sequencing data (PacBio) for two of those species, C. cephus and C. solatus. The purpose of the internship will be to explore these data in order to characterize alpha satellite DNA in both of those species. Purpose-built classification algorithms (Haschka et al., 2021) and alignment tools will be used to explore the diversity (classification into families) and the spatial organization (localization of monomers with respect to each other and along the chromosomes) of these sequences. The evolutionary history of monomers will be inferred using phylogenetic methods and analyzed in light of the phylogenetic relationships between species. The associations of alpha satellite DNA with other types of repeated DNA will be studied as well. This work will lead to the development of new tools for data analysis, and the results will be used to understand the evolutionary mechanisms of these DNA sequences as well as their contribution to centromere function and genome regulatory mechanisms.

Application

Procedure:

Deadline: 15 December 2021

Contacts

Loïc Ponger

 poNOSPAMnger@mnhn.fr

Offer published on 1 October 2021, displayed until 20 December 2021