Mendelian Randomization Studies: Recent Methodological Developments

Even with careful study design, observational epidemiology is prone to confounding, bias, and reverse causation.1-3 As a consequence, observational studies can yield unreliable estimates of the causal effect of risk factors on disease. Mendelian randomization (MR) is a successful approach for overcoming these inherent limitations and improving causal inference.4,5

As MR has been applied many times, the principle, motivation and key assumptions have been explained in detail previously3,6-14 and application to cardiovascular disease reviewed.4,15,16 Briefly, MR exploits the essentially random allocation of alleles at conception as the basis of a natural experiment analogous to a randomized controlled trial (RCT) whereby a genetic variant is used as a proxy for a clinically relevant risk factor for disease (Figure 1).3,17 At a population level, the genotype is largely independent of confounding factors and unmodified by the development of disease.3,18 Therefore, the portion of variance in the modifiable risk factor explained by the genetic variant, unlike the direct measurement of the risk factor, is free of the limitations that would otherwise plague observational studies. Evidence for a causal association between a risk factor and an outcome is provided if the genetic variant, reliably associated with the risk factor, is independent of all measured or unmeasured confounding factors and is associated with the outcome only through the risk factor.8

Figure 1: Basic Principles of MR

The risk factor (E) is causally associated with the outcome (O) if the following conditions hold: (1) the genetic variant (G) is a valid instrument, in that it is reliably associated with E; (2) G has no association with O except through E; and (3) G is independent of any measured or unmeasured confounding factors.
Modified from Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet 2014;23:R89-98.
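
The logic of Figure 1 can be sketched with a small simulation. This is an illustrative sketch rather than a method taken from the studies cited. The effect sizes, allele frequency, sample size, and the helper names `slope` and `simulate` are all assumptions. The ratio of the gene-outcome association to the gene-exposure association (often called the Wald ratio) is used here as one simple MR estimator.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=50000, true_effect=0.5, seed=1):
    """Simulate genotype G, exposure E and outcome O with an unmeasured confounder."""
    rng = random.Random(seed)
    g, e, o = [], [], []
    for _ in range(n):
        # Allele count (0, 1 or 2) with an assumed allele frequency of 0.3.
        gi = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        c = rng.gauss(0, 1)  # confounder that affects E and O but not G
        ei = 0.5 * gi + c + rng.gauss(0, 1)
        oi = true_effect * ei + c + rng.gauss(0, 1)
        g.append(gi)
        e.append(ei)
        o.append(oi)
    return g, e, o


g, e, o = simulate()
observational = slope(e, o)             # confounded: overstates the effect
wald_ratio = slope(g, o) / slope(g, e)  # MR estimate: close to true_effect
print(f"observational: {observational:.2f}, MR (Wald ratio): {wald_ratio:.2f}")
```

Because the simulated genotype is independent of the confounder, the ratio estimate should sit near the assumed true effect of 0.5, whereas the direct regression of outcome on exposure is inflated by the confounder.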

Recent methodological developments have expanded the basic principle of MR to further understand causal associations between traits and health outcomes. Here, we review these developments, presenting applications within cardiovascular and cardiometabolic health. MR is usually performed using a single variant whose biological effect on the risk factor is well understood. For example, individuals carrying a rare proprotein convertase subtilisin/kexin type 9 (PCSK9) allele, which lowers low-density lipoprotein (LDL) to below the population average (by 21-38 mg/dL), showed a 40-80% lower incidence of myocardial infarction (MI).19 However, the effects of common genetic variants are small, making them prone to weak instrument bias, which at worst can bias estimates towards the confounded observational estimate. Increasing the number of genetic variants included increases the variance explained in a given trait and can, therefore, improve the precision of the causal estimate.12-14,20,21
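
Instrument strength is commonly summarized by the first-stage F statistic, which can be computed from the proportion of trait variance explained. The sketch below is illustrative only: the sample size and R² values are hypothetical, `first_stage_f` is an assumed helper name, and the allelic score is treated as a single combined instrument (k = 1). An F below about 10 is a widely used rule of thumb for a weak instrument, not a threshold stated in the text above.

```python
def first_stage_f(r_squared, n, k=1):
    """F statistic for k instruments explaining r_squared of the trait in n people."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


n = 5000                            # hypothetical sample size
f_single = first_stage_f(0.001, n)  # one common variant explaining 0.1% of variance
f_score = first_stage_f(0.02, n)    # allelic score explaining 2% of variance
print(f"single variant F = {f_single:.1f}, allelic score F = {f_score:.1f}")
```

With these assumed inputs the single variant falls below the rule-of-thumb threshold while the allelic score is well above it, which is the sense in which combining variants can strengthen the instrument.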

However, several limitations are introduced by using multiple variants. The possibility of pleiotropic effects is increased, which would violate the conditions for being a valid instrument for the trait.6 Therefore, genetic variants are usually only used as instruments if they are reliably detected and replicated and, if possible, the biological mechanism by which the genetic variant influences the trait is known.3 In this scenario, multiple variants can be used to probe for pleiotropy: consistent estimates derived from multiple independent pathways support a fundamental biological relationship. This is illustrated by the use of different loci to assess the contribution of LDL cholesterol (LDL-C) to coronary heart disease, for which the estimates remain essentially the same.13
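
One way to check whether several variants give consistent estimates is to compute a ratio estimate per variant and test their heterogeneity. The sketch below uses an inverse-variance weighted (IVW) average and Cochran's Q, which are techniques brought in here for illustration and are not named in the text above. All summary statistics are hypothetical, and the standard errors use a first-order approximation that ignores uncertainty in the variant-trait effects.

```python
def ratio_estimates(bx, by, se_by):
    """Per-variant ratio estimates with first-order standard errors."""
    ratios = [y / x for x, y in zip(bx, by)]
    ses = [s / abs(x) for x, s in zip(bx, se_by)]
    return ratios, ses


def ivw_and_q(ratios, ses):
    """Inverse-variance weighted mean and Cochran's Q heterogeneity statistic."""
    weights = [1 / s ** 2 for s in ses]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return pooled, q


# Hypothetical variant-trait (bx) and variant-outcome (by) effects for three loci.
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se_by = [0.01, 0.01, 0.01]
pooled, q = ivw_and_q(*ratio_estimates(bx, by, se_by))

# Add a hypothetical pleiotropic locus: its outcome effect is too large for its trait effect.
pooled_p, q_p = ivw_and_q(*ratio_estimates(bx + [0.10], by + [0.15], se_by + [0.01]))
print(f"consistent loci: estimate {pooled:.2f}, Q {q:.2f}")
print(f"with outlier:    estimate {pooled_p:.2f}, Q {q_p:.2f}")
```

When every locus implies the same causal estimate, Q stays near zero. A locus acting on the outcome through another pathway pulls the pooled estimate away and inflates Q, flagging possible pleiotropy.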

A particularly useful development is the application of MR across two independent samples. High costs or a lack of appropriate measurements often lead to relatively small datasets in which genotype, trait, and outcome are not all measured together. In this case, an estimate of the first-stage, genotype-trait association is obtained from one sample and applied to a second, which has only genotype and outcome measured.3 For example, the estimate of the genetic effect on C-reactive protein (CRP) obtained in one large population of European men was applied to a second collection of observational studies to investigate the potential causal association between circulating levels of CRP and the risk of non-fatal MI.22 This method has also been applied within a single sample that has been randomly split.23
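
The two-sample calculation can be sketched from summary statistics alone. This is a minimal illustration under stated assumptions: the effect sizes and standard errors are hypothetical, `two_sample_ratio` is an assumed helper name, and the standard error uses a delta-method approximation that treats the two samples as independent.

```python
import math


def two_sample_ratio(beta_ge, se_ge, beta_go, se_go):
    """Ratio estimate and delta-method standard error from two independent samples."""
    estimate = beta_go / beta_ge
    variance = se_go ** 2 / beta_ge ** 2 + beta_go ** 2 * se_ge ** 2 / beta_ge ** 4
    return estimate, math.sqrt(variance)


# Hypothetical inputs: sample 1 supplies the genotype-trait association,
# sample 2 supplies the genotype-outcome association.
estimate, se = two_sample_ratio(beta_ge=0.40, se_ge=0.05, beta_go=0.08, se_go=0.02)
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"causal estimate {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The second variance term carries the uncertainty of the first-stage estimate into the final interval, which matters most when the genotype-trait association is imprecise.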

Reciprocal MR can be helpful to further untangle the directional association between a risk factor and outcome (Figure 2). For example, in a study of 5,804 elderly individuals within the Prospective Study of Pravastatin in the Elderly at Risk (PROSPER) study, allelic scores for CRP and adiposity were related to circulating levels of CRP and leptin, and body mass index (BMI).24,25 The CRP allelic score was associated with a decrease in CRP levels but not with measures of adiposity. Conversely, the adiposity allelic score was associated with an increase in BMI, circulating leptin levels and CRP levels, suggesting that elevated CRP levels, as a marker of inflammation, are generated by greater adiposity rather than the reverse. When used in situations in which there is little understanding of the biological effect of a variant on a trait, reciprocal MR can be potentially misleading.3

Figure 2: Bidirectional MR

If a trait (T1) is causally associated with another (T2) then the genetic variant associated with T1 (G1) will be associated with both T1 and T2. However, the reverse (in red) will not be true and the genetic variant associated with T2 (G2) will not be associated with T1 (unless the relationship is truly bidirectional).
Modified from Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet 2014;23:R89-98.
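
The asymmetry described in Figure 2 can be illustrated with simulated data in which T1 causes T2 but not the reverse. This is a sketch under assumed parameters: the effect sizes, allele frequency, sample size, and the helper names `slope` and `allele_count` are all hypothetical.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def allele_count(rng, freq=0.3):
    """Number of risk alleles (0, 1 or 2) at one locus."""
    return int(rng.random() < freq) + int(rng.random() < freq)


rng = random.Random(2)
g1, g2, t1, t2 = [], [], [], []
for _ in range(40000):
    a, b = allele_count(rng), allele_count(rng)
    u = rng.gauss(0, 1)                          # shared confounder of T1 and T2
    x = 0.5 * a + u + rng.gauss(0, 1)            # T1 depends on G1, not on T2
    y = 0.6 * x + 0.5 * b + u + rng.gauss(0, 1)  # T2 depends on T1 and G2
    g1.append(a)
    g2.append(b)
    t1.append(x)
    t2.append(y)

g1_on_t2 = slope(g1, t2)  # forward direction: expected near 0.5 * 0.6 = 0.3
g2_on_t1 = slope(g2, t1)  # reverse direction: expected near 0
print(f"G1 -> T2: {g1_on_t2:.2f}, G2 -> T1: {g2_on_t1:.2f}")
```

Only the instrument for the upstream trait is associated with the downstream trait, which is the pattern that bidirectional MR looks for.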

Reciprocal MR can be extended to explore the causal directions between networks of correlated variables.26,27 This method can be used to assess the potential mediation between environmental exposures and disease, a method termed 'two-step MR'.3,28-30 For example, higher BMI may increase the risk of coronary heart disease in part through its effect on blood pressure (BP).31,32 To date, two-step MR has been applied within epigenetic epidemiology (Figure 3); for example, it has been used to assess the mediation effect of methylation in the association between postnatal growth and childhood adiposity33 and red blood cell folate and cord blood methylation.34

Figure 3: Two-Step MR Applied to Epigenetic Epidemiology

In step 1 (left diagram), one SNP (G1), independent of any confounding factors (C), is used as a proxy for the environmental exposure (E) of an outcome (O) via differential methylation (M). G1 will only influence M if E is causally related to M (red line). In the second step (right diagram), another independent SNP (G2) is similarly used as a proxy for M to assess the causal association between M and O (red line).
Modified from Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet 2014;23:R89-98.
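
The two steps in Figure 3 can be sketched as two separate ratio estimates in simulated data. The parameters below are assumptions chosen for illustration (E affects M by 0.4, M affects O by 0.3), as are the helper names `slope` and `allele_count`. Multiplying the two estimates to approximate the mediated effect assumes linear effects with no interaction.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def allele_count(rng, freq=0.3):
    """Number of effect alleles (0, 1 or 2) at one SNP."""
    return int(rng.random() < freq) + int(rng.random() < freq)


rng = random.Random(3)
g1, g2, e, m, o = [], [], [], [], []
for _ in range(40000):
    a, b = allele_count(rng), allele_count(rng)
    c = rng.gauss(0, 1)                               # confounder of E, M and O
    ei = 0.5 * a + c + rng.gauss(0, 1)                # exposure, proxied by G1
    mi = 0.4 * ei + 0.5 * b + c + rng.gauss(0, 1)     # methylation, proxied by G2
    oi = 0.3 * mi + c + rng.gauss(0, 1)               # outcome
    g1.append(a)
    g2.append(b)
    e.append(ei)
    m.append(mi)
    o.append(oi)

step1 = slope(g1, m) / slope(g1, e)  # effect of E on M, expected near 0.4
step2 = slope(g2, o) / slope(g2, m)  # effect of M on O, expected near 0.3
mediated = step1 * step2             # assumed linear indirect effect of E on O via M
print(f"E -> M: {step1:.2f}, M -> O: {step2:.2f}, mediated: {mediated:.2f}")
```

Each step uses its own independent SNP, so the confounder that distorts the direct E-M and M-O regressions does not enter either ratio.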

As nature's RCT,35 MR is a robust method used to improve causal inference in observational associations, which are otherwise hampered by limitations that produce unreliable estimates of causality. The methodological developments discussed here and future directions, including hypothesis-free, factorial, and multi-phenotype MR,3 will further contribute to the identification of causal pathways between clinically relevant environmental exposures and pharmacological targets for prevention of adverse cardiometabolic health.


  1. Davey Smith G, Ebrahim S. Data dredging, bias, or confounding. BMJ 2002;325:1437-38.
  2. Davey Smith G, Ebrahim S. Epidemiology – is it time to call it a day? Int J Epidemiol 2001;30:1-11.
  3. Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet 2014;23:R89-98.
  4. Timpson NJ, Wade KH, Davey Smith G. Mendelian randomization: application to cardiovascular disease. Curr Hypertens Rep 2012;14:29-32.
  5. Davey Smith G, Ebrahim S. 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 2003;32:1-22.
  6. Davey Smith G, Ebrahim S. Mendelian randomization: prospects, potentials, and limitations. Int J Epidemiol 2004;33:30-42.
  7. Sheehan NA, Didelez V, Burton PR, et al. Mendelian randomisation and causal inference in observational epidemiology. PLOS Med 2008;5:e177.
  8. Lawlor DA, Harbord RM, Sterne JAC, et al. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med 2008;27:1133-63.
  9. Bochud M, Rousson V. Usefulness of Mendelian randomization in observational epidemiology. Int J Environ Res Public Health 2010;7:711-28.
  10. VanderWeele TJ, Tchetgen TE, Cornelis M. Methodological challenges in Mendelian randomization. Epidemiology 2014;25:427-35.
  11. Davey Smith G. Use of genetic markers and gene-diet interactions for interrogating population-level causal influences of diet on health. Genes Nutr 2011;6:27-43.
  12. Lawlor DA, Nordestgaard BG, Benn M, et al. Exploring causal associations between alcohol and coronary heart disease risk factors: findings from a Mendelian randomization study in the Copenhagen General Population Study. Eur Heart J 2013;34:2519-28.
  13. Ference BA, Yoo W, Alesh I, et al. Effect of long-term exposure to lower low-density lipoprotein cholesterol beginning early in life on the risk of coronary heart disease. J Am Coll Cardiol 2012;60:2631-9.
  14. Vimaleswaran KS, Cavadino A, Berry DJ, et al. Association of vitamin D status with arterial blood pressure and hypertension risk: a mendelian randomisation study. Lancet Diabetes Endocrinol 2014;2:719-29.
  15. Jansen H, Samani NJ, Schunkert H. Mendelian randomization studies in coronary artery disease. Eur Heart J 2014;35:1917-24.
  16. Davey Smith G, Timpson NJ, Ebrahim S. Strengthening causal inference in cardiovascular epidemiology through Mendelian randomization. Ann Med 2008;40:524-41.
  17. Davey Smith G, Lawlor DA, Harbord RM, et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLOS Med 2007;4:1985-92.
  18. Ebrahim S, Davey Smith G. Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum Genet 2008;123:15-33.
  19. Cohen JC, Boerwinkle E, Mosley TH Jr, et al. Sequence variations in PCSK9, low LDL, and protection against coronary heart disease. N Engl J Med 2006;354:1264-72.
  20. Brion MJ, Shakhbazov K, Visscher PM. Calculating statistical power in Mendelian randomization studies. Int J Epidemiol 2013;42:1497-1501.
  21. Voight BF, Peloso GM, Orho-Melander M, et al. Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study. Lancet 2012;380:572-80.
  22. Casas JP, Shah T, Cooper J, et al. Insight into the nature of the CRP-coronary event association using Mendelian randomization. Int J Epidemiol 2006;35:922-31.
  23. Richmond R, Davey Smith G, Ness AR, et al. Assessing causality in the association between childhood adiposity and physical activity levels: a Mendelian randomization analysis. PLOS Med 2014;11.
  24. Welsh P, Polisecki E, Robertson M, et al. Unraveling the directional link between adiposity and inflammation: a bidirectional Mendelian randomization approach. J Clin Endocrinol Metab 2010;95:93-9.
  25. Timpson NJ, Nordestgaard B, Harbord RM, et al. C-reactive protein levels and body mass index: elucidating direction of causation through reciprocal Mendelian randomization. Int J Obes (Lond) 2011;35:300-8.
  26. Chaibub Neto E, Keller MP, Attie AD, et al. Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Ann Appl Stat 2010;4:320-39.
  27. Peng CH, Jiang YZ, Tai AS, et al. Causal inference of gene regulation with subnetwork assembly from genetical genomics data. Nucleic Acids Res 2014;42:2803-19.
  28. Relton CL, Davey Smith G. Two-step epigenetic Mendelian randomization: a strategy for establishing the causal role of epigenetic process in pathways to disease. Int J Epidemiol 2012;41:161-76.
  29. van Os J, Rutten BPF, Poulton R. Gene-environment interactions in schizophrenia: review of epidemiology findings and future directions. Schizophr Bull 2008;34:1066-82.
  30. Davey Smith G. Random allocation in observational data: how small but robust effects could facilitate hypothesis-free causal inference. Epidemiology 2011;22:460-3.
  31. Timpson NJ, Harbord R, Davey Smith G, et al. Does greater adiposity increase blood pressure and hypertension risk? Mendelian randomization using FTO/MC4R genotype. Hypertension 2009;54:84-90.
  32. The International Consortium for Blood Pressure Genome-Wide Association Studies. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 2011;478:103-9.
  33. Groom A, Potter C, Swan DC, et al. Postnatal growth and DNA methylation are associated with differential gene expression of the TACSTD2 gene and childhood fat mass. Diabetes 2011;61:391-400.
  34. Binder AM, Michels K. The causal effect of red blood cell folate on genome-wide methylation in cord blood: a Mendelian randomization approach. BMC Bioinformatics 2013;14:353.
  35. Hingorani A, Humphries S. Nature's randomised trials. Lancet 2005;366:1906-8.

Clinical Topics: Dyslipidemia, Geriatric Cardiology, Atherosclerotic Disease (CAD/PAD), Lipid Metabolism, Nonstatins, Novel Agents, Statins

Keywords: Adiposity, Aged, Alleles, Blood Pressure, Body Mass Index, C-Reactive Protein, Cholesterol, LDL, Coronary Artery Disease, Coronary Disease, Environmental Exposure, Epidemiologic Studies, Epigenesis, Genetic, Erythrocytes, Fetal Blood, Folic Acid, Genotype, Incidence, Inflammation, Leptin, Lipoproteins, LDL, Methylation, Myocardial Infarction, Negotiating, Obesity, Phenotype, Pravastatin, Proprotein Convertases, Prospective Studies, Random Allocation, Risk Factors, Subtilisins, Viverridae
