Genomics-driven predictions of Southern Ocean primary production

 Fixed-term contract (CDD) · M2 internship · 6 months · 4 years of higher education (Bac+4) · LM2E (UMR 6197) · Brest (France) · ~550 euros/month

 Start date: 1 February 2022


Keywords: Genomics, Metagenomics, Machine learning, Statistics, Predictions, Plankton, Carbon, Primary production, Southern Ocean, Marine, Antarctic


Duration: 5 to 6 months

Location: Institut Universitaire Européen de la Mer, Plouzané (Brest), France

Stipend: ~550 euros/month

Supervision: Emile Faure (Post-doc, UBO), Loïs Maignien (MC, UBO), Nicolas Cassar (Professor, Duke University), Yajuan Lin (AP, Duke Kunshan University)

Scientific context and research questions:

The field of meta-omics has evolved rapidly over the past decade, allowing the discovery of novel key organisms and functions in multiple environments, including marine ecosystems (Acinas et al., 2019; Delmont et al., 2018; Salazar et al., 2019; Sunagawa et al., 2015). Among the recent global meta-omics surveys of planktonic diversity (e.g. the Tara expeditions, Malaspina or Biogeotraces), only Tara Oceans investigated the Southern Ocean, and only at 2 locations (Acinas et al., 2019; Paoli et al., 2021; Salazar et al., 2019). Between 2016 and 2017, the Antarctic Circumnavigation Expedition (ACE) circumnavigated Antarctica, sampling metagenomes of planktonic communities at more than 30 locations, from the surface down to 3,800 m depth. Over 40 environmental parameters were measured alongside the genomic samples during the cruise. This campaign thus offers an unprecedented opportunity to decipher the links between Southern Ocean planktonic communities and key biogeochemical variables.

Machine learning approaches were recently used to predict protein family abundances from environmental parameters in the global ocean (Faure et al., 2021). In another study, omics data were used to quantitatively estimate the global abundance of nitrogen fixers through machine learning algorithms (Tang and Cassar, 2019). These studies provided compelling evidence that omics and environmental data can be combined to predict the abundance of at least some functional gene clusters. However, the link between these genomic abundances and concrete biogeochemical outputs, such as the efficiency of primary production and carbon export, remains largely unexplored. Eukaryotic metabarcoding data were used to predict net community production at the Western Antarctic Peninsula, helping to identify key taxa for carbon export (Lin et al., 2017, 2021). The ACE campaign metagenomes offer the opportunity to tackle similar questions at the gene and metabolic level.

The goal of this internship is to build models that use features from gene matrices derived from 218 metagenomes of the ACE campaign to predict primary production-related variables such as in-situ net community production, particulate organic carbon concentration, or biogenic silica concentration (which can be related to the abundance of diatoms, which are among the most prominent primary producers in the Southern Ocean). More specifically, the intern will take inspiration from recent studies investigating trait prediction from metatranscriptomic data in terrestrial plants (Guadagno et al., 2020) to answer two questions:

  • Can in-situ primary production-related variables be predicted from meta-omics data, and through which method?

  • Are some genes particularly good predictors of primary production-related variables, and can this be explained by their functional and/or taxonomic annotation?
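
As an illustration only, the two questions above could be explored along the following lines. This is a minimal sketch, not the project's method: it replaces the ACE gene matrices with synthetic data, uses ridge regression as a placeholder model, and all function names (make_synthetic, fit_ridge, cross_validated_r2, rank_genes) are hypothetical. It assumes gene abundances have already been normalised to a common scale, which real metagenomic counts would require.

```python
import random
import statistics


def make_synthetic(n_samples=40, n_genes=10, seed=0):
    """Synthetic stand-in for a (samples x genes) matrix and an NCP-like target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    # Assumption for the demo: only genes 0 and 1 carry signal.
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 0.3) for row in X]
    return X, y


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ridge(X, y, lam=1.0):
    """Ridge regression with an unpenalised intercept; returns (weights, intercept)."""
    p = len(X[0])
    x_mean = [statistics.fmean(col) for col in zip(*X)]
    y_mean = statistics.fmean(y)
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + lam * I) w = Xc^T yc
    A = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wi * mi for wi, mi in zip(w, x_mean))
    return w, intercept


def predict(X, w, intercept):
    """Linear predictions for each row of X."""
    return [intercept + sum(wi * xi for wi, xi in zip(w, row)) for row in X]


def cross_validated_r2(X, y, k=5, lam=1.0):
    """R^2 computed on out-of-fold predictions from k-fold cross-validation."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        w, b0 = fit_ridge([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        for i, p_i in zip(test_idx, predict([X[i] for i in test_idx], w, b0)):
            preds[i] = p_i
    y_mean = statistics.fmean(y)
    ss_res = sum((t - p_i) ** 2 for t, p_i in zip(y, preds))
    ss_tot = sum((t - y_mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def rank_genes(w):
    """Gene indices sorted by decreasing absolute coefficient."""
    return sorted(range(len(w)), key=lambda j: -abs(w[j]))


# Question 1: is the target predictable on held-out samples?
X, y = make_synthetic()
print("out-of-fold R^2:", round(cross_validated_r2(X, y), 3))

# Question 2: which genes carry the most weight in the fitted model?
weights, _ = fit_ridge(X, y)
print("top predictor genes:", rank_genes(weights)[:3])
```

Evaluating on out-of-fold predictions keeps the performance estimate separate from the data used for fitting. With real cruise data, samples taken at the same station or depth profile are unlikely to be independent, so the folds would probably need to be grouped by station rather than assigned by index as done here.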


Objectives:

    • Establish the first metagenomics-based predictive model of primary production in the Southern Ocean

    • Identify the genes and metabolic pathways that are the best predictors of primary production in the Southern Ocean


Candidate profile:

    • Background in machine learning and/or bioinformatics

    • Ability to work autonomously with R and/or Python

    • Previous experience with metagenomics data would be a plus

    • Interest in marine microbial ecology and/or biogeochemistry


References:

    Acinas, S.G., Sánchez, P., Salazar, G., Cornejo-Castillo, F.M., Sebastián, M., Logares, R., Sunagawa, S., Hingamp, P., Ogata, H., Lima-Mendez, G., Roux, S., González, J.M., Arrieta, J.M., Alam, I.S., Kamau, A., Bowler, C., Raes, J., Pesant, S., Bork, P., Agustí, S., Gojobori, T., Bajic, V., Vaqué, D., Sullivan, M.B., Pedrós-Alió, C., Massana, R., Duarte, C.M., Gasol, J.M., 2019. Metabolic Architecture of the Deep Ocean Microbiome. bioRxiv 635680.

    Delmont, T.O., Quince, C., Shaiber, A., Esen, O.C., Lee, S.T., Rappé, M.S., McLellan, S.L., Lücker, S., Eren, A.M., 2018. Nitrogen-fixing populations of Planctomycetes and Proteobacteria are abundant in surface ocean metagenomes. Nat. Microbiol. 3, 804–813.

    Faure, E., Ayata, S.-D., Bittner, L., 2021. Towards omics-based predictions of planktonic functional composition from environmental data. Nat. Commun. 12, 1–15.

    Guadagno, C.R., Millar, D., Lai, R., Mackay, D.S., Pleban, J.R., McClung, C.R., Weinig, C., Wang, D.R., Ewers, B.E., 2020. Use of transcriptomic data to inform biophysical models via Bayesian networks. Ecol. Model. 429, 109086.

    Lin, Y., Cassar, N., Marchetti, A., Moreno, C., Ducklow, H., Li, Z., 2017. Specific eukaryotic plankton are good predictors of net community production in the Western Antarctic Peninsula. Sci. Rep. 7, 14845.

    Lin, Y., Moreno, C., Marchetti, A., Ducklow, H., Schofield, O., Delage, E., Meredith, M., Li, Z., Eveillard, D., Chaffron, S., Cassar, N., 2021. Decline in plankton diversity and carbon flux with reduced sea ice extent along the Western Antarctic Peninsula. Nat. Commun. 12, 4948.

    Paoli, L., Ruscheweyh, H.-J., Forneris, C.C., Kautsar, S., Clayssen, Q., Salazar, G., Milanese, A., Gehrig, D., Larralde, M., Carroll, L.M., Sánchez, P., Zayed, A.A., Cronin, D.R., Acinas, S.G., Bork, P., Bowler, C., Delmont, T.O., Sullivan, M.B., Wincker, P., Zeller, G., Robinson, S.L., Piel, J., Sunagawa, S., 2021. Uncharted biosynthetic potential of the ocean microbiome. bioRxiv 2021.03.24.436479.

    Salazar, G., Paoli, L., Alberti, A., Huerta-Cepas, J., Ruscheweyh, H.-J., Cuenca, M., Field, C.M., Coelho, L.P., Cruaud, C., Engelen, S., Gregory, A.C., Labadie, K., Marec, C., Pelletier, E., Royo-Llonch, M., Roux, S., Sánchez, P., Uehara, H., Zayed, A.A., Zeller, G., Carmichael, M., Dimier, C., Ferland, J., Kandels, S., Picheral, M., Pisarev, S., Poulain, J., Acinas, S.G., Babin, M., Bork, P., Boss, E., Bowler, C., Cochrane, G., Vargas, C. de, Follows, M., Gorsky, G., Grimsley, N., Guidi, L., Hingamp, P., Iudicone, D., Jaillon, O., Kandels-Lewis, S., Karp-Boss, L., Karsenti, E., Not, F., Ogata, H., Pesant, S., Poulton, N., Raes, J., Sardet, C., Speich, S., Stemmann, L., Sullivan, M.B., Sunagawa, S., Wincker, P., Acinas, S.G., Babin, M., Bork, P., Bowler, C., Vargas, C. de, Guidi, L., Hingamp, P., Iudicone, D., Karp-Boss, L., Karsenti, E., Ogata, H., Pesant, S., Speich, S., Sullivan, M.B., Wincker, P., Sunagawa, S., 2019. Gene Expression Changes and Community Turnover Differentially Shape the Global Ocean Metatranscriptome. Cell 179, 1068-1083.e21.

    Sunagawa, S., Coelho, L.P., Chaffron, S., Kultima, J.R., Labadie, K., Salazar, G., Djahanschiri, B., Zeller, G., Mende, D.R., Alberti, A., Cornejo-Castillo, F.M., Costea, P.I., Cruaud, C., d’Ovidio, F., Engelen, S., Ferrera, I., Gasol, J.M., Guidi, L., Hildebrand, F., Kokoszka, F., Lepoivre, C., Lima-Mendez, G., Poulain, J., Poulos, B.T., Royo-Llonch, M., Sarmento, H., Vieira-Silva, S., Dimier, C., Picheral, M., Searson, S., Kandels-Lewis, S., Tara Oceans coordinators, Bowler, C., de Vargas, C., Gorsky, G., Grimsley, N., Hingamp, P., Iudicone, D., Jaillon, O., Not, F., Ogata, H., Pesant, S., Speich, S., Stemmann, L., Sullivan, M.B., Weissenbach, J., Wincker, P., Karsenti, E., Raes, J., Acinas, S.G., Bork, P., Boss, E., Bowler, C., Follows, M., Karp-Boss, L., Krzic, U., Reynaud, E.G., Sardet, C., Sieracki, M., Velayoudon, D., 2015. Structure and function of the global ocean microbiome. Science 348, 1261359.

    Tang, W., Cassar, N., 2019. Data-Driven Modeling of the Distribution of Diazotrophs in the Global Ocean. Geophys. Res. Lett. 46, 12258–12269.


    Application procedure: send an email to Emile Faure and Loïs Maignien

    Application deadline: 8 October 2021


    Emile Faure

    Offer published on 8 September 2021, displayed until 15 October 2021