Mining social media data for opinion polarities about electronic cigarettes
  1. Hongying Dai (1, 2, 3)
  2. Jianqiang Hao (4)
  1. Health Services & Outcomes Research, Children's Mercy Hospital, Kansas City, Missouri, USA
  2. Department of Biomedical & Health Informatics, University of Missouri-Kansas City, Kansas City, Missouri, USA
  3. Department of Pediatrics, University of Missouri-Kansas City, Kansas City, Missouri, USA
  4. Bellevue University, Omaha, Nebraska, USA
  Correspondence to Dr Hongying Dai, Health Services & Outcomes Research, Children's Mercy Hospital, Kansas City 64108, MO, USA; hdai{at}


Background There is an ongoing debate about the harms and benefits of e-cigarettes, use of which has increased rapidly in recent years. By separating non-commercial (organic) tweets from commercial tweets, we seek to evaluate the general public's attitudes towards e-cigarettes.

Methods We collected tweets containing the words ‘e-cig’, ‘e-cigarette’, ‘e-liquid’, ‘vape’, ‘vaping’, ‘vapor’ and ‘vaporizer’ from 23 July to 14 October 2015 (n=757 167). A multilabel Naïve Bayes model was constructed to classify tweets into five polarities (against, support, neutral, commercial, irrelevant). We further analysed the prevalence of e-cigarette tweets, geographic variations in these tweets and the impact of socioeconomic factors on public attitudes towards e-cigarettes.
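
The keyword filter and classification step described above can be illustrated with a minimal sketch. It assumes a plain multinomial Naïve Bayes with add-one smoothing that assigns a single label per tweet; the abstract does not specify the features, smoothing or multilabel handling actually used, so this is not the authors' implementation. The function and class names (`matches_keywords`, `tokenize`, `NaiveBayes`) and the five training tweets are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Search terms listed in the Methods paragraph.
KEYWORDS = ("e-cig", "e-cigarette", "e-liquid", "vape", "vaping", "vapor", "vaporizer")


def matches_keywords(text):
    """Return True if the tweet contains any search term (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing (illustrative assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # training tweets per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / (total_words + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labelled tweets, one per polarity (not data from the study).
train_texts = [
    "vaping is harmful for teens, please stop",
    "e-cig helped me quit smoking, love it",
    "new study on e-cigarette use released today",
    "buy e-liquid now, 20% off with free shipping",
    "water vapor forms clouds in the sky",
]
train_labels = ["against", "support", "neutral", "commercial", "irrelevant"]

model = NaiveBayes().fit(train_texts, train_labels)
```

With a model fitted this way, `model.predict(tweet)` returns one of the five polarity labels for any tweet that passes `matches_keywords`. A realistic classifier would need a far larger labelled sample than this toy set.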

Results Opinions from organic tweets about e-cigarettes were mixed (against 17.7%, support 10.8% and neutral 19.4%). The organic-against tweets delivered strong educational information about the risks of e-cigarette use and urged the general public, especially youth, to stop vaping. However, commercial tweets and organic-support tweets together outnumbered the organic-against tweets by more than 3 to 1. Organic tweets were more prevalent in states with higher education rates (r=0.60, p<0.0001), a higher percentage of black and African-American population (r=0.34, p=0.01) and higher median household income (r=0.33, p=0.02). Support rates for e-cigarettes were higher in states with fewer persons under 18 years old (r=−0.33, p=0.02) and a higher percentage of female population (r=0.3, p=0.02).
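
As a reading aid for the r values above, the following sketch computes a correlation coefficient between two state-level variables. It assumes r denotes Pearson's product-moment correlation, which the abstract does not state. The function name `pearson_r` and the numbers in the example lists are hypothetical placeholders, not values from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values for four states (not data from the study).
education_rate = [25.0, 30.0, 35.0, 40.0]
organic_tweet_rate = [1.0, 1.5, 1.4, 2.2]

r = pearson_r(education_rate, organic_tweet_rate)
```

A value of r near 1 or −1 indicates a strong linear association across states, and a value near 0 indicates little linear association. The p values reported in the abstract would require a separate significance test that this sketch does not cover.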

Conclusions The organic-against tweets raised public awareness of potential health risks and could help prevent non-smokers, adolescents and young adults from using e-cigarettes. Opinion polarities about e-cigarettes on social networks could be highly influential on the general public, especially youth. Future educational campaigns should include measures of their effectiveness.

  • Media
  • Social marketing
  • Socioeconomic status
