Much of the research we publish relates to questions of cause and effect. In an ideal world, we would subject these questions to experimentation, randomising study participants to different conditions. However, in many cases – particularly in the context of addiction – such randomisation is simply not possible. We cannot randomise tobacco-naïve children to use e-cigarettes, for example, to determine whether or not vaping acts as a ‘gateway’ to subsequent smoking. In these cases, we have to rely on observational methods, and these suffer from well-described problems of confounding, including reverse causality.
Several methods exist for strengthening causal inference in such cases, from the use of prospective data and statistical adjustment for confounding, through to propensity score matching and the use of natural experiments. One method, in particular, has experienced exponential growth recently – Mendelian randomization (MR).1 2 This approach uses genetic variants as proxies for an exposure of interest, effectively as a form of instrumental variable analysis. If relevant assumptions hold, this should protect against confounding, including reverse causality, due to the random allocation of genotype at meiosis and the fact that environmental exposures cannot directly alter germline DNA sequence.3
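The instrumental variable logic described above can be illustrated with a minimal simulation. This is a sketch only, not an analysis from any cited study: the function name `simulate_wald_ratio`, the allele frequency, the effect sizes and the single unmeasured confounder are all assumptions chosen for illustration. It compares a naive observational regression slope with the Wald ratio (the genotype-outcome association divided by the genotype-exposure association).

```python
import numpy as np


def simulate_wald_ratio(n=200_000, true_effect=0.5, seed=0):
    """Compare a confounded observational slope with a Wald ratio estimate.

    All numeric values here are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    # Genotype: 0, 1 or 2 copies of an allele, allocated independently of u.
    g = rng.binomial(2, 0.3, size=n)
    # Unmeasured confounder that affects both exposure and outcome.
    u = rng.normal(size=n)
    # Exposure depends on genotype and on the confounder.
    x = 0.4 * g + u + rng.normal(size=n)
    # Outcome depends on the exposure (true causal effect) and the confounder.
    y = true_effect * x + u + rng.normal(size=n)

    # Naive observational estimate: regression slope of y on x.
    observational = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Wald ratio: genotype-outcome slope divided by genotype-exposure slope.
    beta_gx = np.cov(g, x)[0, 1] / np.var(g, ddof=1)
    beta_gy = np.cov(g, y)[0, 1] / np.var(g, ddof=1)
    wald = beta_gy / beta_gx
    return observational, wald


if __name__ == "__main__":
    observational, wald = simulate_wald_ratio()
    print(f"observational slope: {observational:.3f}")
    print(f"Wald ratio estimate: {wald:.3f}")
```

Under these simulated conditions the observational slope is inflated by the confounder, while the Wald ratio should fall close to the true effect. That holds only because the simulated genotype satisfies the instrumental variable assumptions by construction.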
As summary data from a vast range of genome-wide association studies (GWAS) have become widely and freely available, it has become possible to run every permutation of the vast number of exposure-outcome relationships that exist using MR methods. This, in principle, is a good thing. Although MR is not without its limitations (and critics), it is a potentially powerful tool that has provided important evidence – for example suggesting a causal effect of cigarette smoking on some adverse mental health outcomes.4 However, with the advent of platforms such as MR Base (https://www.mrbase.org), every bivariate analysis that is tractable via MR using summary GWAS data can be considered to have already been conducted.5
Unfortunately, GWAS with ever-larger samples have enabled us to detect genetic variants associated with an exposure of interest that are also associated with a range of other exposures (and not via the exposure of interest), effectively reintroducing confounding when using MR. For example, genetic variants associated with smoking initiation are also associated with behavioural outcomes in young children, at an age before any exposure to smoking, suggesting that these variants may not be uniquely capturing smoking initiation but instead some broad risk-taking phenotype.6
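A small simulation can show how variants that affect the outcome other than via the exposure reintroduce bias. This sketch is illustrative only: it uses the standard inverse-variance weighted (IVW) estimator on simulated summary statistics, and the function names, number of variants, effect sizes and standard errors are all assumptions. The true causal effect is set to zero, so any non-zero estimate is spurious.

```python
import numpy as np


def ivw_estimate(beta_gx, beta_gy, se_gy):
    """Inverse-variance weighted estimate from per-variant summary statistics."""
    weights = beta_gx**2 / se_gy**2
    ratios = beta_gy / beta_gx  # per-variant Wald ratios
    return np.sum(weights * ratios) / np.sum(weights)


def simulate_ivw(pleiotropy_mean, n_variants=100, true_effect=0.0, seed=1):
    """Simulate summary data in which variants may act on the outcome directly.

    pleiotropy_mean is the average direct variant-outcome effect that does
    NOT pass through the exposure (hypothetical values, for illustration).
    """
    rng = np.random.default_rng(seed)
    # Variant-exposure associations.
    beta_gx = rng.uniform(0.05, 0.15, size=n_variants)
    # Standard errors of the variant-outcome associations.
    se_gy = np.full(n_variants, 0.005)
    # Direct effects on the outcome via some other phenotype.
    alpha = pleiotropy_mean + rng.normal(0, 0.002, size=n_variants)
    # Observed variant-outcome associations, including sampling noise.
    beta_gy = true_effect * beta_gx + alpha + rng.normal(0, se_gy)
    return ivw_estimate(beta_gx, beta_gy, se_gy)


if __name__ == "__main__":
    print(f"no directional pleiotropy: {simulate_ivw(0.0):.3f}")
    print(f"directional pleiotropy:    {simulate_ivw(0.02):.3f}")
```

With no directional pleiotropy the estimate should sit near the true value of zero. When the variants share a direct effect on the outcome, the estimate moves away from zero even though the exposure has no causal effect in the simulation.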
In other words, while the exposure of interest may be smoking initiation, the primary phenotype most proximal to the genetic proxies used may be risk taking. This means that using these variants as a proxy for smoking initiation may introduce genetic confounding. Dynastic effects may also be operating, whereby offspring genetic variants become associated with particular environmental exposures due to parental genotypes influencing these exposures and (of course) offspring genotype. This is likely to be a particular issue in the context of substance use.
Taken together, this means that using MR in the context of complex behavioural exposures requires careful thought – the use of negative controls to exclude alternative pathways to the outcome of interest, and ideally triangulation of evidence using multiple study designs, combined with a detailed understanding of the plausible biological pathways where possible.7
Despite this, we are unfortunately seeing an ever-increasing number of MR studies that simply use summary GWAS data, and lack negative controls or evidence from other study designs and methodologies to strengthen inference. These often investigate causal pathways that are already known (eg, whether smoking causes coronary artery disease), or exposures that simply do not lend themselves to genetic instrumentation within an MR framework (eg, skipping breakfast). Ultimately, these studies either do not advance knowledge (because the answer is already known) or offer little more than a conventional observational study. Indeed, they may offer less, and in fact have negative utility, in that they come packaged with causal claims, in a way that conventional observational studies typically do not. In this way they may actually serve to degrade knowledge.
What is driving this increase? Ultimately, it is down to current incentive structures that reward publication over knowledge.
An indicator of this is that there are now relatively few studies applying MR methods that report null results. This is ironic. A key early application of MR was to establish whether associations widely reported in conventional observational analyses were causal. For example, circulating C-reactive protein (CRP) is associated with coronary heart disease (CHD) and the development of novel therapeutics targeted CRP because of its presumed causal role; however, MR analyses established that CRP does not increase CHD risk, with the likely reason for the observed association being that early stages of the atherosclerotic disease process increase CRP levels, as do many established causes of CHD such as cigarette smoking and elevated adiposity.8 9 Another key early null MR finding related to HDL cholesterol, which was widely considered to reduce CHD risk; MR and randomised controlled trials (RCTs) concurred in demonstrating the lack of benefit of higher HDL.10 Such null results are critical in correcting erroneous findings in observational epidemiology.
There are other ways to use Mendelian randomization that genuinely add to the sum of human knowledge. For example, Khouja and colleagues used multivariable MR to attempt to dissect the effects of nicotine and non-nicotine constituents of tobacco smoke on outcomes known to be caused by smoking.11 And Davies and colleagues triangulated evidence from MR and the natural experiment of the raising of the school leaving age in the UK to understand the causal effects of educational attainment on smoking initiation.12
MR analyses have already been used to good effect in a range of areas relevant to addictive behaviours. For example, in conventional analyses, ‘moderate’ alcohol intake is associated with reduced cardiovascular risk, when compared with abstinence or heavier drinking. These findings, repeatedly reported over several decades, achieved widespread recognition.
However, using genetic variants that strongly influence alcohol consumption, together with the natural experiment created by few women drinking in east Asian countries, genotype-predicted alcohol intake was shown to be linearly related to blood pressure; critically, this was only observed in men – the lack of a relationship among women (who drank virtually no alcohol) indicated that the genetic variants did not have an effect on blood pressure except through their relationship with alcohol intake.13 Using the same approach, alcohol was shown to have a continuous linear adverse effect on stroke risk.14 As mentioned above, MR was also used to demonstrate that HDL cholesterol – which was supposed to mediate the protective effect of alcohol on CHD – was not actually protective. Thus, the evidence from MR studies and that from randomised controlled trials triangulated to clarify one of the most controversial issues in cardiovascular epidemiology.
We therefore suggest that MR studies that simply use summary GWAS data to estimate the causal effect of X on Y should be given a low priority for publication unless they genuinely advance knowledge. This could be achieved by exploring complex exposures and/or outcomes (including testing for possible mediation via multivariable MR), incorporating negative controls and/or evidence from other study designs such as natural experiments, and collaborating with biologists to advance plausible biological mechanisms. MR studies should also conform to the STROBE-MR reporting guidelines.15
We do not wish to be overly prescriptive, but ultimately if a study offers nothing more than the mechanical application of a statistical package to publicly available summary data, then it may not warrant the use of editorial and reviewer time, and journal space.
Footnotes
Contributors MRM (conceptualization [equal], writing—original draft [lead], writing—review & editing [equal]), JB (conceptualization [equal], writing—review & editing [equal]), MH (writing – review & editing [equal]), and GDS (conceptualization [equal], writing—review & editing [equal]).
Competing interests MRM leads an MRC Programme that conducts a substantial amount of MR research (MC_UU_00032/07). JB has received unrestricted funding to study smoking cessation from J&J and Pfizer, who manufacture medically licensed smoking cessation pharmacotherapies. MH has no competing interests to declare. GDS coauthored the first extended exposition of Mendelian randomization, and therefore has considerable intellectual investment in the approach and has received funding for MR studies over many years. He directs an MRC Unit that conducts a substantial amount of MR research (MC_UU_00032/01).