Detection of non-coding RNAs in genomic sequences: application to the genome of Ralstonia solanacearum

General information
Thesis/HDR details
Tom Coenye
Alain Denise
Hélène Touzet
Matthieu Arlat
Gwennaele Fichant
Supervisor (for theses)
Christine Gaspin
Abstract in English
Noncoding RNAs (ncRNAs) have recently emerged as key regulators of diverse cellular processes in both prokaryotes and eukaryotes. Despite the large number of noncoding RNAs known today, no universal feature allowing their reliable prediction has been found. Nevertheless, it is known that in A+T-rich thermophilic archaea, ncRNAs can be detected on the basis of their elevated G+C content. On the other hand, no studies have explored the compositional properties of noncoding RNAs in G+C-rich genomes. Here we study noncoding RNA detection in Ralstonia solanacearum, a G+C-rich beta-proteobacterium in which no systematic search for noncoding RNAs had previously been undertaken. We first studied the existence of a compositional bias in the ncRNAs of the A+T-rich bacterium Staphylococcus aureus.

From a methodological point of view, this work resulted in a procedure, based on generalised linear modelling, for testing the G+C bias in different genome features, and in noncoding RNAs in particular. We show that S. aureus ncRNAs, as well as some repeat sequences, are characterised by a significant compositional bias that can be used for their detection. The same approach was less successful when applied to the R. solanacearum genome.
Complementary to the compositional bias approach, we used comparative genome analysis of different strains of R. solanacearum to detect conserved noncoding RNAs. During this work, we developed a new version of RNAsim, a tool that uses a graph-theoretic approach to predict conserved intergenic regions in multiple genomes.