Congratulations to Aurélien Birer and Pierre Andrieu, who won the poster prizes awarded by the SFBI!
Aurélien's poster:
Aurélien's background:
Pierre's poster:
The aim of biological data ranking is to help users who are faced with huge amounts of data choose between alternative pieces of information. This is particularly important when querying biological data integration systems, where even very simple queries can return thousands of answers. For instance, searching for the set of human genes involved in breast cancer returns thousands of answers in the reference database EntrezGene, without any ranking in terms of importance. Ranking solutions able to order answers are therefore crucial for helping scientists organize their time and prioritize the new experiments to be conducted. However, ranking biological data is a difficult task for various reasons: biological data are usually annotation files that reflect expertise, and may thus be associated with various degrees of confidence; the need expressed by scientists must also be taken into consideration, for example whether the most well-known data should be ranked first, or the most recent. As a consequence, although several ranking methods have been proposed in recent years within the bioinformatics community, none of them has been deployed on systems currently in use.
The approach we propose is to rank biological data in two steps. First, several ranking methods are applied to the biological data (results are ordered using alternative ranking criteria). Second, we use consensus ranking methods that reflect the common points of the input rankings while not giving too much importance to elements classified as "good" by only one or a few rankings. This problem, known as the median problem for a set of rankings, is NP-hard. Since providing a consensus ranking is a crucial need for big biological data sets, scalable algorithms are required, and designing them is highly challenging. Besides, the problem has mainly been studied in the case of permutations, where elements are strictly ordered, while in real applications some elements may be placed at the same position (considered equally important). The challenge is then to design an algorithm that computes one consensus ranking from a set of rankings with ties.
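To make the median problem concrete, here is a minimal Python sketch of how a candidate consensus can be scored against a set of rankings with ties. It is not the algorithm presented on the poster. It assumes that a ranking is a list of buckets (tied elements share a bucket), that all rankings cover the same elements, and that every pair of elements on which two rankings disagree costs 1 (one common form of the generalized Kendall-tau distance). The function names are hypothetical, and the last function is only a naive baseline that returns the input ranking closest to all the others.

```python
from itertools import combinations


def positions(ranking):
    """Map each element to the index of its bucket (tied elements share an index)."""
    return {elem: i for i, bucket in enumerate(ranking) for elem in bucket}


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def generalized_kendall_tau(r1, r2):
    """Count the pairs of elements whose relative order differs between r1 and r2.

    Assumption: a pair costs 1 when it is inverted, and also 1 when it is
    tied in one ranking but not in the other.
    """
    p1, p2 = positions(r1), positions(r2)
    return sum(
        1
        for a, b in combinations(sorted(p1), 2)
        if sign(p1[a] - p1[b]) != sign(p2[a] - p2[b])
    )


def kemeny_score(candidate, rankings):
    """Sum of distances from a candidate consensus to every input ranking."""
    return sum(generalized_kendall_tau(candidate, r) for r in rankings)


def best_input_ranking(rankings):
    """Naive baseline: return the input ranking with the lowest score."""
    return min(rankings, key=lambda r: kemeny_score(r, rankings))


rankings = [
    [["A"], ["B", "C"], ["D"]],      # B and C are tied
    [["A"], ["B"], ["C"], ["D"]],    # strict order
    [["B"], ["A"], ["C", "D"]],      # C and D are tied
]
print(best_input_ranking(rankings))  # [['A'], ['B'], ['C'], ['D']]
```

A median ranking minimizes this score over all possible rankings with ties, not only over the inputs, which is what makes the exact problem expensive on large data sets.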
We introduce a new algorithm that computes a consensus ranking from a set of rankings with ties. The originality of our approach lies in providing an efficient solution that (i) is based on a graph decomposition of the data sets, used to partition them efficiently, and (ii) has several interesting and fundamental properties, which make it possible to evaluate the relevance of a given solution and to provide the exact consensus in many cases. A set of experiments has been conducted on several hundred biological and synthetic data sets. First results appear very promising: our algorithm is able to compete with the best currently available algorithms while being efficient enough to be used in real settings, in particular as the algorithm used on http://conqur-bio.lri.fr/.
Pierre's background:
Shortly after passing the competitive pharmacy entrance exam, I turned to a bachelor's degree in mathematics and then a master's degree in bioinformatics. I am currently doing an internship at Université Paris-Sud on rank aggregation applied to biological data, and I am about to start a PhD that continues the internship topic.