Project
One striking observation in human genetics today is that, as research advances in understanding the genetic architecture of complex traits and the etiology of heritable diseases, new paradigms keep emerging that reveal ever more of the complexity of biological models. Indeed, the human genome contains about 20,000 protein-coding genes, which is hardly more than the worm Caenorhabditis elegans. The complexity of the human organism, i.e. the great diversity of its cell types and functions, must therefore result from highly combinatorial and finely tuned regulation of the expression of these genes. As a consequence, each genetic element (e.g. variant, gene) is expected to influence several traits. This phenomenon is called pleiotropy.
Although pleiotropy is extremely common and is thought to play a central role in the genetic architecture of human complex traits and diseases, it remains one of the least understood phenomena.
One of the most compelling lines of evidence supporting pleiotropy is provided by genome-wide association studies (GWASs), which estimate the effects of genetic variants across the genome on a trait of interest. GWASs have identified countless genetic variants significantly associated with many complex traits and diseases, most likely because of pleiotropy, yet a causal mechanism could not be pinpointed in the vast majority of cases. Consequently, many applications and newly developed methods have successfully reused GWAS results, principally to study relationships between traits. One booming field that uses GWAS summary statistics is causal inference between traits in the form of Mendelian randomization. The principle of Mendelian randomization is simple and analogous to a randomized controlled trial: the alleles of genetic variants take the place of the drug/placebo assignment, and their effects are modeled through regression to estimate and test the causal effect of an exposure trait on an outcome trait. Although extremely appealing, Mendelian randomization relies on a strong assumption: the absence of horizontal pleiotropy, which occurs when a variant has independent effects on both the exposure and the outcome. Pleiotropy has tended to be neglected in Mendelian randomization applications. In a paper published in Nature Genetics in 2018, we showed that horizontal pleiotropy cannot be neglected: it occurs in almost 50% of causal relationships, biasing causal estimates and inflating the false discovery rate of causal relationships.
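To make the regression step concrete, here is a minimal Python sketch of an inverse-variance weighted (IVW) estimate, one common way to combine per-variant effects in summary-statistics Mendelian randomization. It is an illustration only and not the method of the papers cited here: the function name ivw_estimate and all numbers are hypothetical toy values. The second call shows how a single variant with an independent effect on the outcome (horizontal pleiotropy) shifts the causal estimate.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted causal estimate from summary statistics.

    Equivalent to a weighted regression through the origin of the
    variant-outcome effects on the variant-exposure effects, with
    weights 1 / se_outcome**2. Hypothetical helper for illustration.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Toy data (assumed values): four variants, true causal effect of 0.5.
beta_x = [0.10, 0.20, 0.15, 0.30]        # variant effects on the exposure
true_effect = 0.5
beta_y = [true_effect * bx for bx in beta_x]  # effects on the outcome
se_y = [0.02, 0.02, 0.02, 0.02]          # standard errors of beta_y

estimate, std_error = ivw_estimate(beta_x, beta_y, se_y)

# Horizontal pleiotropy: the last variant also acts on the outcome
# directly, through a pathway that does not involve the exposure.
beta_y_pleio = list(beta_y)
beta_y_pleio[3] += 0.10
biased_estimate, _ = ivw_estimate(beta_x, beta_y_pleio, se_y)

print(f"no pleiotropy:   {estimate:.3f} (SE {std_error:.3f})")
print(f"with pleiotropy: {biased_estimate:.3f}")
```

With no pleiotropy the estimate recovers the simulated effect, whereas one pleiotropic variant is enough to pull it away from the true value, which is why the assumption matters in practice.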
On a related topic, in 2019 we published a proof-of-concept paper in Genome Biology showing not only that horizontal pleiotropy can be detected, but also that pleiotropy can be quantified at the level of the genetic variants themselves. We showed that pleiotropy is widespread across the human genome.
Today we intend to go further. We have conceptualized 5 biological mechanisms leading to pleiotropy: 1) linkage disequilibrium; 2) causality between traits; 3) genetic correlation between traits; 4) high polygenicity of traits; 5) horizontal pleiotropy (true independent effects of a variant on two traits). We propose to build a comprehensive framework that uses machine learning to disentangle all 5 states of pleiotropy, to provide a genome-wide map of pleiotropy for genetic variants, and to infer causal relationships between traits. Specifically, we propose 1) to improve on the method published in our proof-of-concept paper, using unsupervised approaches based on penalized methods, random forests or deep learning; 2) to explore semi-supervised learning, using a creative data-labeling strategy that we have developed. Human genetic variant databases are increasingly useful, from the interpretation of genetic analyses to clinical interpretation. We strongly believe that a database describing the pleiotropic nature of variants will complement existing databases and serve the community. Importantly, the full code of the resulting methodology and the genome-wide map of pleiotropy will be made publicly available and highlighted in scientific publications.
References related to the postdoc position
Verbanck, Marie, Chia-Yen Chen, Benjamin Neale, and Ron Do. 2018. “Detection of Widespread Horizontal Pleiotropy in Causal Relationships Inferred from Mendelian Randomization Between Complex Traits and Diseases.” Nature Genetics, April, 1. doi:10.1038/s41588-018-0099-7.
Jordan, Daniel M., Marie Verbanck, and Ron Do. 2019. “HOPS: A Quantitative Score Reveals Pervasive Horizontal Pleiotropy in Human Genetic Variation Is Driven by Extreme Polygenicity of Human Traits and Diseases.” Genome Biology 20 (1): 222. doi:10.1186/s13059-019-1844-7.
Morrison, Jean, Nicholas Knoblauch, Joseph H. Marcus, Matthew Stephens, and Xin He. 2020. “Mendelian Randomization Accounting for Correlated and Uncorrelated Pleiotropic Effects Using Genome-Wide Summary Statistics.” Nature Genetics, May, 1–8. doi:10.1038/s41588-020-0631-4.
Darrous, Liza, Ninon Mounier, and Zoltán Kutalik. 2020. “Simultaneous Estimation of Bi-Directional Causal Effects and Heritable Confounding from GWAS Summary Statistics.” medRxiv, January, 2020.01.27.20018929. doi:10.1101/2020.01.27.20018929.