Keywords
Molecular dynamics
Deep Learning
Protein flexibility
Large-scale analysis
Description
A protein's flexibility is essential to its biological function. While the recent revolution in structural bioinformatics, driven by the release of AlphaFold2, has significantly democratized access to static three-dimensional protein structures, the large-scale analysis and prediction of protein dynamics remains one of the most important challenges in the field.
In 2023, our group developed ATLAS (1), a comprehensive database of molecular dynamics (MD) simulations that compiles detailed information on protein flexibility for a representative set of protein structures. The article describing ATLAS rapidly became one of the most cited recent papers in Nucleic Acids Research, highlighting its strong impact within the field. ATLAS allowed us to re-evaluate the relationship between protein flexibility and the pLDDT score, which reflects the confidence of AlphaFold2 predictions (2), and it has been extensively used by the community to develop bioinformatics tools such as AlphaFlow and Boltz-2. In the meantime, we have used ATLAS to develop PEGASUS (3), a protein language model-based tool that predicts protein flexibility at different scales directly from protein sequence. PEGASUS improves on the performance of MEDUSA (4), the previous state-of-the-art tool, which was also developed by our team.
The student will have the opportunity to contribute to the following research directions, which are being actively explored as part of the project's further development:
Advanced MD trajectory analysis: implementation of new metrics and comparative approaches to characterize protein flexibility and structural variability;
Exploration of new protein families and folds: integration of novel AI-predicted folds, as well as intrinsically disordered proteins, into the ATLAS dataset;
Enhanced MD sampling: extension of simulation lengths and refinement of the protocols to improve the representativeness of conformational ensembles;
Development of new deep learning models: design and implementation of new architectures to predict dynamic and flexibility-related properties directly from protein sequence and structure data.
The project thus combines molecular modelling, data analysis, structural bioinformatics, and machine learning, offering the student both theoretical and practical experience in structural biology.
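As an illustration of the trajectory-analysis direction listed above, the following is a minimal sketch of one commonly used flexibility metric, the per-atom root-mean-square fluctuation (RMSF). It is not taken from the ATLAS pipeline: the function name `rmsf`, the nested-list trajectory layout, and the toy data are all assumptions made for this example, and the frames are assumed to be already superimposed on a common reference.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF from pre-aligned frames (hypothetical helper).

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom (e.g. C-alpha atoms), with the same atom order in every frame.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared deviation of atom i from its mean position
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy two-frame trajectory (assumed data): atom 0 is fixed,
# atom 1 moves between x = 1 and x = 3.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
values = rmsf(toy)
```

In practice, coordinates would be read from simulation output with a dedicated trajectory library and aligned before this step; the pure-Python version here only makes the definition explicit.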
References:
1. Vander Meersche Y, Cretin G, Gheeraert A, Gelly J-C, Galochkina T. ATLAS: protein flexibility description from atomistic molecular dynamics simulations. Nucleic Acids Res 52(D1), D384–D392 (2024). DOI: 10.1093/nar/gkad1084
2. Vander Meersche Y, Diharce J, Gelly J-C, Galochkina T. Flexibility or uncertainty? A critical assessment of AlphaFold 2 pLDDT. Structure (2025). DOI: 10.1016/j.str.2025.09.001
3. Vander Meersche Y, Duval G, Cretin G, Gheeraert A, Gelly J-C, Galochkina T. PEGASUS: Prediction of MD-derived protein flexibility from sequence. Protein Science 34, e70221 (2025). DOI: 10.1002/pro.70221
4. Vander Meersche Y, Cretin G, de Brevern A G, Gelly J-C, Galochkina T. MEDUSA: Prediction of Protein Flexibility from Sequence. J Mol Biol 433(11), 166882 (2021). DOI: 10.1016/j.jmb.2021.166882