In this commentary we consider the validity of tobacco industry-funded research on the effects of standardised packaging in Australia. As the first country to introduce standardised packs, Australia is closely watched, and Philip Morris International has recently funded two studies into the impact of the measure on smoking prevalence. Both of these papers are flawed in conception as well as design but have nonetheless been widely publicised as cautionary tales against standardised pack legislation. Specifically, we focus on the low statistical significance of the analytical methods used and the assumption that standardised packaging should have an immediate large impact on smoking prevalence.
- Tobacco industry
- public policy
- packaging and labelling
An increasing number of countries are considering legislation to enforce standardised tobacco packaging (also known as ‘plain packaging’) consistent with the Framework Convention on Tobacco Control provision to prevent false, misleading or deceptive labelling.1 Tobacco industry resistance to such legislation is increasing, as demonstrated by the industry's funding of two working papers which claim that standardised packaging has had no impact on smoking prevalence in Australia, the first country to have successfully introduced this measure.2,3 The first working paper focused on children aged 14–17 years and the second on adults. The former has been criticised thoroughly elsewhere,4–6 and although the authors reject these criticisms,7 the second shares many of the same flaws and biases. There are two core issues to consider. The first is that the introduction of standardised packaging is not expected to have a sudden impact on smoking prevalence, but rather to affect the rate of smoking uptake. The second is that neither paper is in fact powered to detect any plausible impact. We examine the arguments in detail and focus on the second paper, on adults, though our observations apply to both.
Both working papers use the Roy Morgan Single Source (Australia) data set, from January 2001 to December 2013, and calculate monthly smoking prevalence over the period. The authors assert that the data for the working paper on children are publicly available, although they have to be purchased. They note that data for the adult working paper were “provided to us by Philip Morris International.”
In both papers, the authors first construct a linear time trend of the decline in smoking prevalence over an extended period before the introduction of standardised packaging, a decline which has been driven by a range of tobacco control measures. The papers then assess whether the observed decline after the standardised packaging measure was introduced significantly exceeds the level predicted by the linear trend. To do so they assess whether the deviations from this trend are different in the year after the legislation compared to the year before, and finally they compute 90% CIs around the monthly prevalence figures to see if these span the extended trend line. These working papers show that the decline in smoking prevalence averaged 0.48 percentage points per year at all ages in Australia over the period (and 0.44 percentage points per year for children) and that prevalence continued to fall at a similar rate after the introduction of standardised packaging. This is interpreted as a lack of evidence for effectiveness of the measure.
Such a continued decline is likely a positive sign given that, as prevalence reaches very low levels, the proportion of more addicted smokers who find it more difficult to quit increases.8 This therefore means that a continuation of the same rate of decline in this context, rather than a levelling off, in fact likely represents an increase in effectiveness of tobacco control measures. Any determination of statistical power in the analysis presented needs to rest on the assumption that the new legislation would increase the rate of decline, rather than simply retaining it.
A sudden decline in smoking prevalence was not envisaged in this legislation,9 with any effect expected to be predominantly around initiation of smoking rather than rates of quitting. Experts who had commented on the legislation prior to implementation (in a research paper cited by the authors in their working paper on adults) estimated that even 2 years would be too short a timespan in which to witness the full impact of standardised packaging measures.10 Hence, the immediate effect being sought is implausible, partly due to these time lags and partly due to the inadequacy of prevalence as a measure in this context. The nature of addiction is such that any impact on quitting is likely to take place over several years and a small increase in the rate of decline may be more realistic than an immediate dramatic fall in prevalence.
Even if the assumption of an immediate decline is accepted, the analysis of statistical power in both papers is opaque. The authors claim that ‘the power of our inference models is remarkably high’, supporting their argument with results of Monte Carlo simulations, and calculating a power of 0.85 to detect a 0.5% reduction in prevalence. These calculations are based on achieving statistical significance in any 1 of 12 tests, and the authors fail to emphasise that these high-power calculations result from considering all tests simultaneously (information that is hidden in technical details of their report). They fail to quote the corresponding significance level, which needs to be on the same basis (combined across all 12 tests), and erroneously claim their power applies at the 5% significance level. While the significance level associated with one CI being entirely below that expected based on extending the time trend is 5% (based on one side of 90% CIs), the significance level associated with any 1 of the 12 CIs (across the 12-month time period) being entirely below that expected is 46% (since the probability of no significant result out of 12 tests = 0.95^12 = 0.54, giving a significance level of 1 − 0.54 = 46%). The convoluted nature of the analytical description also obscures whether there were in fact 13 tests rather than 12; nevertheless, the significance level would still be approximately 50% for 13 tests.
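The combined significance level quoted above can be checked with a few lines of arithmetic. The sketch below assumes, as the calculation in the text implicitly does, that the monthly tests are independent; the function name is ours and purely illustrative.

```python
def familywise_significance(alpha: float, n_tests: int) -> float:
    """Chance of at least one 'significant' result among n_tests
    independent tests when there is no true effect (assumption:
    the tests are independent, each at level alpha)."""
    return 1.0 - (1.0 - alpha) ** n_tests


# One-sided 5% level per monthly CI, as described in the text.
fw_12 = familywise_significance(0.05, 12)
fw_13 = familywise_significance(0.05, 13)

print(f"12 tests: {fw_12:.2f}")  # 0.46
print(f"13 tests: {fw_13:.2f}")  # 0.49
```

If the monthly tests were positively correlated, the combined level would be somewhat lower than this independent-tests figure, but it would remain far above the 5% claimed.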
For the second paper on adults, the authors also amended the overall definition of significance used to calculate power (ie, they remove December 2012 from their analysis), which avoids overall significance favouring standardised packaging (by the definition that they use to calculate power, based on significance on any one of several tests). Such amendments raise additional concerns about the ad hoc nature of the approach used in these working papers.
A more suitable model would therefore be one in which the potential effect of standardised packaging is gradual, increasing linearly to reach the predicted decrease at the end of a given period (figure 1). We have performed a Monte Carlo simulation using the gradual effect model just described and a t test, which is everywhere more powerful than the CI method. The power figures we have obtained are shown in table 1, side by side with the power figures published in the second working paper. One can see that they are quite different. This table emphasises the issues with their claim that “From January 2013 on, even very powerful statistical techniques no longer can pick up any change from the pre-existing trend.” Simply put, the data, although suggestive of a standardised packaging effect, and the method used to analyse them do not permit any inference one way or the other. The same can be said of their first working paper on children. Although data on uptake of smoking specifically rather than prevalence would allow a stronger research design, a statistical analysis of this gradual effect would be a more appropriate use of the available data.
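To illustrate the kind of calculation involved, the following is a minimal sketch of a Monte Carlo power estimate under a gradual effect model. It is not the simulation behind table 1. The monthly sample size, starting prevalence, 12-month post-legislation window and normal approximation to sampling error are all hypothetical assumptions, and the one-sided critical value is taken from the normal distribution as an approximation to the t distribution with about 150 degrees of freedom. Only the 0.48 percentage points per year trend and the 0.5 percentage point effect size come from the working papers as described above.

```python
import math
import random
from statistics import NormalDist


def detrend(values, months):
    """Residuals of `values` after an OLS fit on intercept + linear trend."""
    n = len(values)
    t_bar = sum(months) / n
    v_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in months)
    s_tv = sum((t - t_bar) * (v - v_bar) for t, v in zip(months, values))
    slope = s_tv / s_tt
    return [v - v_bar - slope * (t - t_bar) for t, v in zip(months, values)]


def simulate_power(effect, n_pre=144, n_post=12, monthly_n=4000,
                   start_prev=0.24, annual_decline=0.0048,
                   alpha=0.05, n_sims=1000, seed=1):
    """Share of simulated series in which a one-sided test on the ramp
    coefficient detects an extra decline reaching `effect` after n_post
    months. All default values except annual_decline are hypothetical."""
    rng = random.Random(seed)
    months = list(range(n_pre + n_post))
    # Gradual effect: 0 before the legislation, rising linearly to 1.
    ramp = [0.0] * n_pre + [(k + 1) / n_post for k in range(n_post)]
    true_prev = [start_prev - annual_decline / 12 * t - effect * r
                 for t, r in zip(months, ramp)]
    # Normal approximation to the sampling error of a monthly proportion.
    sd = [math.sqrt(p * (1 - p) / monthly_n) for p in true_prev]

    # Partial out intercept and trend from the ramp once (Frisch-Waugh).
    ramp_resid = detrend(ramp, months)
    s_rr = sum(r * r for r in ramp_resid)
    dof = len(months) - 3  # intercept, trend and ramp are estimated
    crit = NormalDist().inv_cdf(1 - alpha)  # approximates the t quantile

    rejections = 0
    for _ in range(n_sims):
        observed = [rng.gauss(p, s) for p, s in zip(true_prev, sd)]
        obs_resid = detrend(observed, months)
        coef = sum(r * e for r, e in zip(ramp_resid, obs_resid)) / s_rr
        rss = sum((e - coef * r) ** 2 for r, e in zip(ramp_resid, obs_resid))
        se = math.sqrt(rss / dof / s_rr)
        if coef / se < -crit:  # evidence of an extra decline
            rejections += 1
    return rejections / n_sims


for effect in (0.0, 0.005, 0.01):
    print(f"effect {effect:.3f}: estimated power {simulate_power(effect):.2f}")
```

Setting `effect` to zero gives a check on the false positive rate of the test, which should sit close to the nominal `alpha`. The estimated power for any non-zero effect depends heavily on the assumed monthly sample size, so the printed values should not be read as estimates for the Roy Morgan data.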
Both of these working papers have been heavily publicised by the tobacco industry, continuing a tradition of misrepresenting evidence.11 The publication of the adult paper on the website of the University of Zürich was accompanied by a media release issued by the Institute for Policy Evaluation, which had been commissioned by Philip Morris International to carry out the study. This claimed that “The experts found no evidence for a standardized packaging effect on smoking prevalence using standard techniques for statistical analysis, in particular requiring a statistical significance level of 5%, which is the standard in applied research.” Several major tobacco companies have additionally referred to these papers in their responses to the UK consultation on standardised packaging; for example, Philip Morris Limited submitted that:
In both studies, using standard techniques for statistical analysis and applying the standard statistical significance level of 5%, the experts found no evidence that “standardised packaging” had had an effect on smoking prevalence among Australians aged 14 to 17 years old (in the case of the March study) or Australians aged 14 and above (in the case of the June study). Kaul and Wolf confirmed that if there had been an effect in reality […] it would have been reflected in the data. According to the study, however, no effect was found.12
The initial paper on children was published in the lead up to the publication of the Chantler review on standardised packaging in England, which nonetheless supported the measure as a means to reduce smoking. There is a continuing need to guard against such misrepresentations of the evidence as other countries look to Australia to inform their own policies on standardised packaging. Despite such opposition from the tobacco industry, the most recent Australian government data show that smoking rates have in fact fallen to a historic low of 12.8%.13 Further positive news is that other countries, including the UK, New Zealand and Ireland, are now moving towards the introduction of standardised packaging to protect the health of their populations.
Contributors PD analysed the data. AAL and PD produced the first draft of the manuscript with input from HCW, NSH and MM. All authors made significant contributions to the writing of the manuscript and critical interpretation of the data.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The data from the original papers comes from the Roy Morgan Single Source (Australia) dataset, which is available for a fee. The data used in this manuscript has been ‘reverse engineered’ from published material—details will be posted on the website of PD.