Development of an emBASE-Galaxy bridge

Position type
Dates
Position duration
Renewable contract
Non-renewable contract
Start date
Posting expiry date
Location
Address

Genome Biology, EMBL, Heidelberg

Heidelberg
Germany

Contacts
Charles Girardot
Contact email(s)
charles.girardot@embl.de
Description
Next-generation sequencing (NGS) is the key technology for analysing the transcriptome, determining DNA-binding protein maps, interrogating sequence variants and sequencing the genomes of new organisms or individuals. Genome Biology Computational Support provides the whole Genome Biology unit with support for NGS data storage and analysis. We use emBASE, a local branch of the open-source BASE platform, to store and annotate both microarray and NGS data. Data analysis is performed on a high-performance cluster and managed using a local Galaxy instance.

In this project, we propose to improve the communication between emBASE and Galaxy. In particular, the project aims to enable emBASE to add data libraries in Galaxy, launch predefined workflows (e.g. QC analysis, demultiplexing, read mapping) and transfer results (QC reports, demultiplexed datasets) back to emBASE for long-term archiving. These functionalities will be developed using the Galaxy API and web services. The project also includes analytical activities such as creating new analysis workflows, adding or developing the necessary tools and ensuring their smooth integration with the compute cluster.

The successful candidate will have practical knowledge of Python and SQL, must be able to work on Linux servers and should be comfortable with server administration. Knowledge of R and Bioconductor is a strong plus. Knowledge of other languages such as PHP, Java, Perl and JavaScript is more than welcome. The candidate will work together with software developers and bioinformaticians of the Genome Biology unit. Organisation, commitment and rigour are essential.
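
To give candidates a feel for the emBASE-to-Galaxy calls mentioned in the project description, here is a minimal Python sketch that only builds (and never sends) two HTTP requests: one to create a data library and one to launch a workflow. It is an illustration under assumptions, not the project's design. The endpoint paths (`/api/libraries`, `/api/workflows/<id>/invocations`), the `key` query parameter and the payload fields should be checked against the API documentation of the Galaxy version in use, and the function names, server URL and identifiers are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def _galaxy_post(galaxy_url, api_key, path, payload):
    """Build (but do not send) a JSON POST request to a Galaxy API path.

    Assumption: the API key is passed as a `key` query parameter.
    """
    url = f"{galaxy_url.rstrip('/')}{path}?{urlencode({'key': api_key})}"
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def build_create_library_request(galaxy_url, api_key, name, description=""):
    """Request that would create a data library (assumed endpoint)."""
    payload = {"name": name, "description": description}
    return _galaxy_post(galaxy_url, api_key, "/api/libraries", payload)


def build_invoke_workflow_request(galaxy_url, api_key, workflow_id,
                                  inputs, history_id):
    """Request that would launch a predefined workflow (assumed endpoint).

    `inputs` maps workflow input steps to dataset references; its exact
    shape is an assumption to verify against the Galaxy API docs.
    """
    path = f"/api/workflows/{quote(workflow_id, safe='')}/invocations"
    payload = {"inputs": inputs, "history_id": history_id}
    return _galaxy_post(galaxy_url, api_key, path, payload)


if __name__ == "__main__":
    # Hypothetical server and identifiers, for illustration only.
    req = build_create_library_request(
        "https://galaxy.example.org", "SECRET", "Run 42", "Raw reads")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending such a request would be a matter of passing it to `urllib.request.urlopen`. Keeping request construction separate from network access makes the bridge logic easy to unit-test without a running Galaxy server.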