Internship: Single-nucleotide variant detection in whole bacterial genomes

Internship · M1 internship (first-year Master's) · 3 months · Bac+4 · LESAFFRE INTERNATIONAL · Lille (France)

Start date: 1 October 2021


Bacteria · SNP · Genomics · Variant calling


Within Lesaffre there is a need to routinely compare whole bacterial genomes in order to determine whether isolates are identical. The Lesaffre strain collection was established by isolating strains from multiple samples. Although these samples are physically separate, they may still contain the same strain or closely related strains. In addition, the specific isolation and screening procedure selects for the best-performing strain from different samples, and this phenotypic selection may favour genetically similar or identical strains. To develop the best market proposition, truly different strains need to be evaluated for their functionality in the final application. Whole-genome-based genetic analysis will be instrumental in selecting truly different strains.

Besides the above application, the strain comparison tool can also be used to detect in-house cross-contamination of specific strains or to identify which strains have been used in competitor products.

Within this project you will be attached to the Data Science team to develop a data analysis pipeline that allows the routine comparison of bacterial strains down to the single-nucleotide level. You will work with representatives of the Lesaffre bacteria platform, who will support the biological interpretation of the data and formulate their needs as an internal client.

Technically, you will first evaluate multiple approaches, SNP-based or WGS-MLST-based, for their functionality and scalability. Second, time permitting, you will design a reporting suite including strain distance visualisation. Finally, you will automate the analysis pipeline for routine use (working with tools such as a workflow management system, containers and git) and apply it to Lesaffre's full whole-genome bacterial data set.
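To give a feel for the SNP-based approach, here is a minimal Python sketch of a pairwise SNP distance computation. It is an illustration only, not the pipeline to be developed: it assumes the genomes have already been aligned to a common reference so that all sequences have the same length, and the isolate names, the toy sequences and the helper names `snp_distance` and `distance_matrix` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

# Assumption: these characters mark missing or ambiguous calls and are ignored.
AMBIGUOUS = set("N-?")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both bases are unambiguous and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in AMBIGUOUS and b not in AMBIGUOUS
    )


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; real input would come from a variant-calling step.
alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAC",
    "isolate_C": "ACGAACGTNC",
}

distances = distance_matrix(alignment)
for (x, y), d in distances.items():
    print(f"{x}\t{y}\t{d}")
```

In this toy data, isolate_A and isolate_B are at distance 0 and would be reported as identical, while isolate_C differs from each by one SNP (the `N` position is skipped). A matrix of this kind is the usual input for the strain distance visualisation mentioned above.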


Procedure:

Deadline: 30 September 2021


Fabien Pichon

Offer published on 3 September 2021, displayed until 30 September 2021