Big tobacco focuses on the facts to hide the truth: an algorithmic exploration of courtroom tropes and taboos
Stephan Risi (1, 2), Robert N Proctor (1)
  1. History, Stanford University, Stanford, California, USA
  2. Programs in the Digital Humanities, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
Correspondence to Stephan Risi, History, Stanford University, Stanford, CA 94305, USA; risi{at}


Objective To use methods from computational linguistics to identify differences in the rhetorical strategies deployed by defence versus plaintiffs’ lawyers in cigarette litigation.

Methods From 318 closing arguments in 159 Engle progeny trials (2008–2016) archived in the Truth Tobacco Industry Documents, we calculated frequency scores and Mann-Whitney Rho scores of plaintiffs versus defence corpora to discover ‘tropes’ (terms used disproportionately by one side) and ‘taboos’ (terms scrupulously avoided by one side or the other).
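
As a rough illustration of the scoring step described in the Methods, the sketch below computes per-document relative frequencies for a term and then a Mann-Whitney rho for one corpus against the other. It assumes the common definition rho = U / (n_a * n_b), that is, the share of cross-corpus document pairs in which the first corpus uses the term more often, with ties counted as half. The abstract does not specify the authors' tokenisation or exact formula, so the function names, the regular expression, and the toy sentences here are all hypothetical and are not taken from the study's code or data.

```python
from collections import Counter
import re


def term_frequencies(text):
    """Relative frequency of each lowercase word token in one document."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {term: n / total for term, n in Counter(tokens).items()}


def mann_whitney_rho(sample_a, sample_b):
    """U / (n_a * n_b): share of (a, b) pairs with a > b, ties counted as half."""
    wins = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(sample_a) * len(sample_b))


def rho_for_term(term, docs_a, docs_b):
    """Rho near 1: term favoured by corpus A; near 0: favoured by corpus B."""
    freqs_a = [term_frequencies(d).get(term, 0.0) for d in docs_a]
    freqs_b = [term_frequencies(d).get(term, 0.0) for d in docs_b]
    return mann_whitney_rho(freqs_a, freqs_b)


# Invented toy "closing arguments" (assumption: not drawn from the trial archive).
plaintiff_docs = [
    "the company hid the truth for profits",
    "profits mattered more to the company than customers",
]
defence_docs = [
    "he made his own decisions and knew the risks",
    "she made a decision to smoke despite the risks",
]

for term in ("profits", "risks"):
    print(term, rho_for_term(term, plaintiff_docs, defence_docs))
```

Under this definition, a term with rho close to 1 or 0 is a candidate trope for one side, and a term that one side almost never uses while the other uses it routinely is a candidate taboo. A value of 0.5 means the term does not separate the two corpora.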

Results Defence attorneys seek to place the smoker on trial, using his or her friends and family members to demonstrate that he or she must have been fully aware of the harms caused by smoking. We show that ‘free choice,’ ‘common knowledge’ and ‘personal responsibility’ remain key strategies in cigarette litigation, but algorithmic analysis allows us to understand how such strategies can be deployed without actually using these expressions. Industry attorneys rarely mention personal responsibility, for example, but invoke that concept indirectly, by talking about ‘decisions’ made by the individual smoker and ‘risks’ they assumed.

Conclusions Quantitative analysis can reveal heretofore hidden patterns in courtroom rhetoric, including the weaponisation of pronouns and the systematic avoidance of certain terms, such as ‘profits’ or ‘customer.’ While cigarette makers use words that focus on the individual smoker, attorneys for the plaintiffs refocus agency onto the industry. We show how even seemingly trivial parts of speech—like pronouns—along with references to family members or words like ‘truth’ and ‘facts’ have been weaponised for use in litigation.

  • litigation
  • tobacco industry
  • tobacco industry documents

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


  • Contributors Both authors contributed by conceptualising the project and writing and revising the report. Stephan Risi wrote the code and accompanying website for the project.

  • Funding This work was supported by the State of California’s Tobacco-Related Disease Research Program (TRDRP) high impact pilot award “Fighting Big Tobacco with Big Data,” award number 25IP-0017.

  • Competing interests RNP has served as an expert witness for plaintiffs in cigarette litigation.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available in a public, open access repository.
