Understanding e-cigarette content and promotion on YouTube through machine learning


Introduction YouTube is a social media platform that is popular among youth and hosts electronic cigarette (e-cigarette) content. We used machine learning to identify the content of e-cigarette videos, the e-cigarette products they feature, the video uploaders, and the marketing and sales of e-cigarette products.

Methods We identified e-cigarette content by searching 18 terms (eg, e-cig) from fictitious youth viewer profiles. Using the video metadata as input to supervised machine learning, we built four models to predict: (1) video themes, (2) featured e-cigarette products, (3) channel type (ie, video uploaders) and (4) discount/sales. We assessed the association between engagement data and the outcomes of the four models.

Results 3830 English videos were included in the supervised machine learning. The most common video theme was ‘product review’ (48.9%), followed by ‘instruction’ (eg, ‘how to’ use/modify e-cigarettes; 17.3%); diverse e-cigarette products were featured; ‘vape enthusiasts’ most frequently posted e-cigarette videos (54.0%), followed by retailers (20.3%); 43.2% of videos had discount/sales of e-cigarettes; and the most common sales strategy was external links for purchasing (34.1%). ‘Vape trick’ was the least common theme but had the highest engagement (eg, >2 million views). ‘Cannabis’ (53.9%) and ‘instruction’ (49.9%) themes were more likely to have external links for purchasing (p<0.001). The four models achieved an F1 score (a measure of model accuracy) of up to 0.87.
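
For readers unfamiliar with the F1 score mentioned above, the following is a minimal sketch of how it is computed for a single class, as the harmonic mean of precision and recall. The function name and the labels are hypothetical and chosen only for illustration; the abstract does not state how the study's scores were computed or averaged across classes.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical video-theme labels (not data from the study).
y_true = ["review", "review", "instruction", "review", "instruction", "vape trick"]
y_pred = ["review", "instruction", "instruction", "review", "review", "vape trick"]

review_f1 = f1_score(y_true, y_pred, "review")
trick_f1 = f1_score(y_true, y_pred, "vape trick")
```

In this toy example, two of the three true 'review' videos are predicted correctly and one 'instruction' video is mislabelled as 'review', so precision and recall for 'review' are both 2/3. A score of 1.0 would mean every prediction for that class was correct and none was missed.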

Discussion Our findings indicate that YouTube videos accessible to youth feature a variety of e-cigarette products across diverse video themes, often with discounts or sales. The findings highlight the need to regulate the promotion of e-cigarettes on social media platforms.

  • Advertising and Promotion
  • Electronic nicotine delivery devices
  • Media
  • Social marketing
  • Surveillance and monitoring

Data availability statement

Data are available upon reasonable request. The data used in this study are publicly available on YouTube, and we can also provide them upon reasonable request.
