Bioinformatician (M/F) - Description, Storage, and Standardization of Datasets and Workflows
Fixed-term project contract (CDD-OD) · Engineer (other) · 23 months · Bac+5 / Master · Institut Pasteur · Paris (France)
Start date: 1 October 2023
Workflows, annotation, standards, knowledge base
The ShareFAIR project (PEPR “Digital Health”) aims to promote the sharing and exchange of health data and their analysis, with a focus on interoperability, reusability, and transparency.
Bioinformatic analyses are complex and rely on various tools that need to be configured and chained together. In this context, improving the reproducibility of results is of paramount importance, especially in the field of health. This is typically achieved through the design, implementation, and execution of workflows (e.g. with Snakemake or Nextflow), which also make data provenance easier to track.
These workflows are generally scattered across public repositories, poorly annotated, and difficult to query. Challenges, therefore, include the standardization and annotation of datasets and workflows, as well as their synthesis into interoperable, shareable, and reusable workflows.
Within the scope of this project, we are seeking an engineer specialized in bioinformatics workflows, data, and knowledge engineering to contribute to the definition and implementation of standards and best practices to achieve these objectives. The successful candidate will work closely with a multidisciplinary team, including bioinformatics researchers and engineers, developers, and data management experts.
Main Missions and Activities:
- Identification of standards for the representation and annotation of workflows:
  - Perform an in-depth analysis of existing standards such as RO-Crate, EDAM, and others that are relevant.
  - Evaluate their applicability to the specific needs of the ShareFAIR project.
  - Recommend and justify appropriate choices of standards for the representation and annotation of workflows.
- Construction of a knowledge base integrating the identified standards:
  - Design and implement an infrastructure for the creation of a consolidated knowledge base, using the selected standards.
  - Develop automated pipelines for the integration and management of data from different sources.
  - Collaborate with the team to ensure the quality, consistency, and accuracy of data in the knowledge base.
- Adaptation and improvement of concepts borrowed from standards:
  - Examine the scope and limitations (in terms of quality and coverage) of the identified standards.
  - Propose improvements and adaptations to meet the specific needs of the ShareFAIR project.
  - Implement these improvements in collaboration with the development team.
Master's degree (Bac+5) in computer science or bioinformatics.
The Hub of Bioinformatics and Biostatistics and Institut Pasteur are committed to promoting gender equality, and female candidates are encouraged to apply.
Required Education and Skills:
Proficiency in Python and/or Java for software development.
Solid knowledge of databases, including SQL and/or NoSQL.
Familiarity with knowledge representation formats such as JSON and RDF.
Understanding of ontologies and bioinformatics workflows (an advantage).
Ability to work independently and collaborate effectively within a multidisciplinary team.
Good communication and documentation skills.
Proficiency in professional English.
Application procedure: Send your CV and cover letter by email.
Application deadline: None
Hervé Ménager, Frédéric Lemoine
Offer published on 26 July 2023, displayed until 31 December 2023