Twitter analysis of California's failed campaign to raise the state's tobacco tax by popular vote in 2012
  1. Miao Feng1,4,
  2. John P Pierce2,3,
  3. Glen Szczypka4,
  4. Lisa Vera2,
  5. Sherry Emery4
  1. 1Department of Communication, University of Illinois at Chicago, Chicago, Illinois, USA
  2. 2Moores Cancer Center, University of California, San Diego, La Jolla, California, USA
  3. 3Department of Family Medicine and Public Health, University of California San Diego, La Jolla, California, USA
  4. 4Health Media Collaboratory at NORC at the University of Chicago, Chicago, Illinois, USA
  1. Correspondence to Dr John P Pierce, Moores Cancer Center, Division of Population Sciences, University of California, San Diego, 3855 Health Sciences Drive, La Jolla, CA 92037-0901, USA; jppierce{at}


Background The rapid diffusion of social media in the past decade has allowed community members to sway the discourse on elections. We use analyses of social media to provide insight into why the strong public support 1 year prior to the election did not result in an increased tobacco tax from the 2012 California Proposition 29 vote.

Methods Using the Twitter historical Firehose, we chose all tweets on Proposition 29 posted between 1 January and 5 June 2012 differentiating between early and late campaign periods. Tweets were coded for valence, theme and source. We analysed metadata to characterise accounts. Television ratings data in 9 major California media markets were used to show the strength of the 2 campaigns.

Results ‘No on 29’ launched television advertising earlier and with much higher household gross rating points (GRPs) than the ‘Yes on 29’ campaign. Among 17 099 relevant tweets from 8769 unique accounts, 53% supported Proposition 29, 27% opposed and 20% were neutral. Just under half (43%) were from accounts affiliated with the campaigns. Two-thirds of campaign messages originated outside California. The ‘Yes’ campaign focused on simple health messages, which were equally represented in both campaign periods. However, anti-tax tweets increased relative to pro-tax tweets in the second period.

Conclusions Although the Prop 29 campaigns did not effectively engage the Californian Twitter communities, analysis of tweets provided an earlier indication than public polls of the loss of public support in this election. Prospective Twitter analysis should be added to campaign evaluation strategies.

  • Taxation
  • Public policy
  • Media
  • Tobacco industry



Increasing cigarette taxes is one of the most effective strategies to reduce cigarette smoking,1 a major public health goal.2 Historically, California (2012 population: 38 million) has been a worldwide leader in implementing tobacco control policies and has maintained prices in the top tier among US states.3 California is one of 24 US states that allow citizens to circumvent their representatives to initiate statewide legislation using the ballot process. However, since 1998, California has failed to increase cigarette taxes, resulting in cigarette prices that are much lower than most other US states.4 This paper analyses social media data to identify how the pro-excise and anti-excise tax campaigns attempted to influence the electorate in the failed 2012 excise tax initiative, Proposition 29, the California Cancer Research Act of 2012.

The initiative process involves a proposal to the state Attorney General for approval to proceed. Once approved, the government provides an initiative title and prepares a summary and fiscal impact statement. Initiative sponsors then have 180 days to collect support signatures, from at least 5% of voters in the previous gubernatorial election. After signature verification, the initiative receives a proposition number and is included on the next statewide election ballot.

Proposition 29 proposed to increase the state cigarette tax by US$1 per pack (from US$0.87 to US$1.87) to finance California's research on cancer and to strengthen the state's smoking prevention and cessation programmes. The petition qualified in August 2010,5 and was scheduled for the 5 June 2012 Presidential primary election ballot, an election to select the party Presidential candidates, which, historically, has a lower voter turnout than the November Presidential election. Mass media campaigns for such propositions typically start with persuasive communications to influence the vote choice and then move to ‘Get Out The Vote’ strategies just prior to the election.6 The ‘No on 29’ campaign, well funded by the Tobacco Industry, began mass media advertising on 16 April 2012, 3 weeks before the ‘Yes’ campaign.7 A key strategy of proposition campaigns is to choose persuasive messages in which voters are already heavily invested, ones that will remain ‘sticky’ in the face of aggressive counterarguing.8 One such ‘sticky’ persuasive message, identified in previous successful campaigns, was the use of increased tax revenues to prevent children from starting to smoke.9

Public support for Proposition 29 was initially strong, with polling data in 2010 and 2011 indicating 75% support, although no data were available on the ‘stickiness’ of this support in the presence of counterarguments.7 The ‘Yes on 29’ campaign focused on a simple, somewhat generalised health message, such as ‘Beat Cancer’. The ‘No’ campaign did not challenge this health message, but focused on labelling the proposition as ‘flawed’, as requiring a significant new bureaucracy or as using California money to fund out-of-state scientists.7 The ‘Yes’ campaign quickly began losing public support: by 14–16 May polling showed that support had decreased to the low 50% range, and by the end of May, support was in the 40% range, with some 20% of likely voters still undecided.7 Proposition 29 was defeated by a narrow margin, 50.2% (No) vs 49.8% (Yes).

In recent years, the rapid and popular adoption of social media has revolutionised mass media campaigns by allowing community members to join the public discourse in real time. Following the 2008 Presidential Election, many commentators recognised the potential of social media to play an important role in influencing voter decisions.10 ,11 Social media data allow the identification of the social conversations that influence campaign outcomes.12–14 Many campaigns now use analyses of available social media to assess the level and valence of community engagement, to inform ongoing decision-making for message optimisation.15 Twitter is the platform of choice for studying community engagement with campaigns, as Twitter is widely used for political communication16 and is an open platform with few privacy restrictions. In 2012, the Twitter ‘Firehose’ (real-time tweet data) contained more than 340 million tweets from over 100 million users worldwide.17

In this paper, we analyse tweets that we identified as being associated with Proposition 29 during the period when campaigns used mass media advertising to influence the vote on the proposition. We coded the message content of the identified tweets and investigated how the overall tweet content changed between the early campaign and the later campaign (before vs after 15 May) to identify whether the ‘Yes’ campaign modified its messages in the face of polls suggesting a rapid loss of public support. Finally, as this Proposition was for eligible California voters, it was important that California residents participate in the social media commentary.18 ,19 To assess this engagement, we geocoded the metadata for each tweet to identify the proportion of tweets that might have been generated from within California.


Television advertisements

As in previous research,20 we used television ratings data obtained by licence from Nielsen Media Research via Kantar Media, to measure potential exposure to pro-Prop 29 and anti-Prop 29 television ads targeted to a general household audience in the nine California media markets between January and June 2012. Ratings represent a standard metric for quantifying advertising intensity: the percentage of the target audience reached multiplied by the frequency of exposure. Thus, an ad that aired five times, reaching 65% of the audience each time it aired, would achieve 325 (5×65%) gross rating points (GRPs). We averaged weekly GRPs across media markets and divided this value by 100 to obtain the average weekly potential exposure to each type of ad.
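The GRP arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the three market values in the usage note are hypothetical and are not taken from the study data.

```python
def gross_rating_points(airings, reach_percent):
    """GRPs = number of airings x percentage of the target audience reached per airing."""
    return airings * reach_percent


def average_weekly_exposure(weekly_grps_by_market):
    """Average one week's GRPs across media markets, then divide by 100."""
    mean_grps = sum(weekly_grps_by_market) / len(weekly_grps_by_market)
    return mean_grps / 100


# The worked example from the text: five airings, each reaching 65% of the audience.
worked_example = gross_rating_points(5, 65)
```

For instance, with hypothetical weekly GRPs of 300, 250 and 350 in three markets, `average_weekly_exposure([300, 250, 350])` gives an average potential exposure of 3.0 for that week.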

Twitter data extraction

Twitter data were obtained from Gnip from the historical Firehose, which provides access to the entire corpus of tweets with associated metadata during a given time period.21–23 There were several hashtags associated with the campaigns, such as #prop29, #yeson29 and #beatcancer. We included all campaign-related hashtags in our filter rules. Since many Twitter users do not use hashtags in their messages, we also extracted data using the following nine keyword filters: ‘California Cancer Research Act’; ‘Californians Care’; ‘California Tax (es)’; ‘CA Tax (es)’; ‘Californians for a Cure’; ‘prop29’; ‘Proposition 29’; ‘@prop29’ and ‘LaDonna Porter’ (the tobacco industry spokesperson in a key television ad on Prop 29 who was the focus of many tweets). This resulted in 115 619 potentially relevant tweets and retweets over the study period.
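A filter of this kind might be approximated as follows. This is a sketch under stated assumptions, not the rule syntax used with the Gnip service: matching is assumed to be case-insensitive and bounded by non-word characters, ‘Tax (es)’ is read as ‘tax’ or ‘taxes’, and only the three hashtags named in the text are listed because the full hashtag list is not given.

```python
import re

# Hashtags named in the text; the complete campaign hashtag list is not given.
HASHTAGS = ["#prop29", "#yeson29", "#beatcancer"]

# The nine keyword filters, written as regular expressions.
# Assumption: "Tax (es)" means either "tax" or "taxes".
KEYWORDS = [
    r"california cancer research act",
    r"californians care",
    r"california tax(?:es)?",
    r"ca tax(?:es)?",
    r"californians for a cure",
    r"prop29",
    r"proposition 29",
    r"@prop29",
    r"ladonna porter",
]


def build_filter(hashtags, keywords):
    """Combine all terms into one case-insensitive pattern with word-like boundaries."""
    terms = [re.escape(tag) for tag in hashtags] + list(keywords)
    bounded = [r"(?<!\w)(?:" + term + r")(?!\w)" for term in terms]
    return re.compile("|".join(bounded), re.IGNORECASE)


FILTER = build_filter(HASHTAGS, KEYWORDS)


def is_candidate(text):
    """Return True if the tweet text matches any hashtag or keyword filter."""
    return FILTER.search(text) is not None
```

The boundary checks keep a short term such as ‘CA tax’ from matching inside a longer word (for example ‘Africa taxes’). Broad terms of this kind still retrieve many irrelevant tweets, which is why a precision check on the retrieved set is needed.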

Twitter data preparation/cleaning

We restricted our analytic data set to tweets in English. To assess the quality of our keywords, we stratified our retrieved archive by keyword and randomly sampled 200 tweets from each keyword data set. Using two coders (intercoder reliability: κ>0.9), we assessed precision of the retrieved tweets (ie, the proportion of tweets that were relevant to Proposition 29), using 80% precision as the criterion for accepting tweets from each keyword. Two keywords did not meet this criterion, both focused on California taxes. After applying additional keyword filters (‘tobacco’ and ‘cigarette’ plus ‘taxes’) in these data sets, they were much reduced in size and met the criterion for acceptance. Further cleaning removed 7333 nonsensical tweets, resulting in an analytic data set of 17 099 tweets, posted by 8769 unique user accounts over the 9 weeks prior to the vote.
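The keyword quality check can be illustrated with a short sketch: sample up to 200 tweets per keyword, have coders label each as relevant or not, and accept a keyword when its precision reaches the criterion. The data layout (dictionaries with a `keyword` field), the function names and the choice of `>=` for the 80% criterion are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(tweets, per_keyword=200, seed=0):
    """Group tweets by the keyword that retrieved them and sample within each group."""
    rng = random.Random(seed)
    by_keyword = defaultdict(list)
    for tweet in tweets:
        by_keyword[tweet["keyword"]].append(tweet)
    return {
        keyword: rng.sample(group, min(per_keyword, len(group)))
        for keyword, group in by_keyword.items()
    }


def precision(labels):
    """Proportion of sampled tweets that coders judged relevant (True)."""
    return sum(labels) / len(labels)


def accepted_keywords(labels_by_keyword, threshold=0.80):
    """Flag each keyword as accepted when its precision meets the threshold."""
    return {
        keyword: precision(labels) >= threshold
        for keyword, labels in labels_by_keyword.items()
    }
```

A keyword that fails the check would then be narrowed with additional terms and re-assessed, as was done for the two tax-related keywords.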

Data analysis

We used human coders to code each tweet for valence and theme and to categorise the source of each tweet by the type of user account. We allowed valence to be positive, negative or neutral towards Proposition 29. To define themes, following previous practice,21 ,22 we sampled a random 600 tweets from the cleaned data set. Using a peer-reviewed framework,24 two experienced coders identified four mutually exclusive themes: (a) cigarette tax (subthemes were pro-tax and anti-tax); (b) health; (c) voting information (such as poll updates and neutral facts) and (d) public responses (eg, public reactions such as confusion, doubts and questions about Proposition 29 and/or mobilisation to vote).

We categorised the user accounts as either commercial or personal. Commercial accounts included any username/bio that represented an organisation (ie, related to either side of campaign such as @USCHealthNews), as opposed to personal accounts that appeared unaffiliated. Personal accounts were further categorised into ‘influencers’ and ‘organic’ accounts. We operationalised ‘influencer’ as an account with more than 320 followers (ie, twice the median number of followers among accounts in the clean analytic data set). Organic users represented non-commercial accounts22 with 320 or fewer followers. Finally, we included a source category that we labelled ‘others’, representing accounts no longer active when we did the coding in July 2015. Within the sample, intercoder reliability (Krippendorff's α) for coding the content valence and themes, and tweet source was 0.89 and 0.91, respectively. One researcher (MF) coded each tweet and each user's profile for location information.25
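The account categories defined above can be expressed as a simple decision rule. The sketch below is illustrative only: in the study, human coders judged whether an account represented an organisation, so `is_organisation` and `is_active` stand in for those manual judgements, and the order in which the checks are applied is an assumption.

```python
from statistics import median


def influencer_threshold(follower_counts):
    """Twice the median follower count among accounts in the data set."""
    return 2 * median(follower_counts)


def categorise_account(is_active, is_organisation, followers, threshold=320):
    """Assign an account to one of the four source categories."""
    if not is_active:
        # Accounts no longer accessible at coding time could not be inspected.
        return "others"
    if is_organisation:
        return "commercial"
    # Personal accounts: more than the threshold counts as an influencer.
    return "influencer" if followers > threshold else "organic"
```

With the study's cut-off of 320, an account with exactly 320 followers is classed as organic and one with 321 as an influencer.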


Timing of television ratings and tweets

Figure 1 presents the media GRPs in California media markets and relevant number of tweets for the ‘Yes on 29’ and ‘No on 29’ campaigns for 9 weeks prior to election day. The ‘No’ campaign outperformed the ‘Yes’ campaign on GRPs throughout the 9 weeks. While the ‘No’ campaign had GRPs in each of these weeks, the ‘Yes’ campaign started later in limited markets on 8 May 2012. The ‘Yes’ campaign had two peak weeks from 20 May to 2 June during which GRPs were over 170 per week; however, even during these peak weeks, the GRPs of the ‘No’ campaign were more than double those of the ‘Yes’ campaign.

Figure 1

Timing of TV ratings and tweets preceding election. GRPs, gross rating points.

On average, the daily tweet frequency did not exceed 400 until the later campaign period. As both campaigns moved into full ‘Get Out The Vote’ mode in the final week before the election, daily tweet frequency increased dramatically, from about 700 tweets 3 and 4 days before the election, to about 1400 tweets 2 days prior, to over 2000 on the day before the election. Throughout the 9 weeks, tweet frequency was associated with the amount of the mass media purchased in the same week (Spearman's correlation r=0.72, p<0.001).
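For readers unfamiliar with Spearman's correlation, the statistic is the Pearson correlation of the ranks of two paired series, with tied values given their average rank. The following sketch is a plain-Python illustration of that definition; it is not the software used in the study, and it does not compute a p value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` would hold the GRPs for each period and `y` the tweet counts for the same periods. Because only ranks are used, the measure is insensitive to the very large tweet counts in the final days before the vote.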

Tweet valence and themes

Overall, 53% (n=9059) of the relevant tweets were supportive of Proposition 29, about one-quarter were opposed (n=4532, 27%), and 20% (n=3508) were neutral (table 1). Just over 43% (n=7393) were generated from accounts identified with the campaigns, 19% were categorised as from influential accounts and 37% (n=6401) were from non-influencers or organic accounts. In the early campaign period, when the ‘Yes on 29’ television campaign had little exposure, 6383 tweets were posted, of which 62% (n=3957) were supportive of Proposition 29. Considering only the organic tweets, 72% were supportive during this period compared to only 56% in the later campaign period (p<0.0001). Conversely, 16% of organic tweets were opposed to Proposition 29 in the early campaign period compared to 26% in the later campaign period (p<0.0001). While tweets from influencers in the early period were twice as likely to be supportive of the ‘Yes’ campaign as opposed to it, in the later campaign period supportive tweets declined significantly (57.8% to 42.9%, p<0.0001), accompanied by an increase in oppositional tweets (27.6% to 35.4%). Tweets from commercial accounts followed a similar pattern: supportive tweets declined from 56.2% to 42.5% (p<0.0001) while oppositional tweets increased from 24% to 30.4% (p<0.0001).

Table 1

Source and valence of tweets before and after 15 May 2012
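As a quick arithmetic check, the overall valence shares can be recomputed from the counts reported in the text. The sketch below uses only those three reported counts; the variable names are illustrative.

```python
# Valence counts as reported in the text.
valence_counts = {"supportive": 9059, "opposed": 4532, "neutral": 3508}

# The three categories should account for every tweet in the analytic data set.
total_tweets = sum(valence_counts.values())

# Percentage share of each valence category, to one decimal place.
valence_shares = {
    label: round(100 * count / total_tweets, 1)
    for label, count in valence_counts.items()
}
```

The counts sum to the 17 099 tweets in the analytic data set, and the shares are consistent with the whole-number percentages reported in the text.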

Distribution of persuasive messages in tweets

The distribution of tweets by persuasive theme (health, pro-tax, anti-tax) is presented for both study periods by the type of user account (figure 2). The proportion of commercial, influencers and organic tweets with a health theme stayed relatively constant before and after 15 May (commercial: 40% vs 39%; influencer: 37% vs 38%; organic: 59% vs 54%). The proportion of pro-tax tweets declined significantly in the later campaign period (commercial: 22% to 13%, p<0.0001; influencers: 22% to 11%, p<0.0001; organic: 19% to 11%, p<0.0001). Conversely, tweets with an anti-tax theme increased significantly in the later period (commercial: 39% to 48%, p<0.0001; influencers: 41% to 51%, p<0.0001; organic: 22% to 34%, p<0.0001).

Figure 2

Distribution of persuasive messages.

Within the cigarette tax category, pro-tax and anti-tax tweets often categorised the issue in radical terms such as ‘a furious fight’, ‘political battle’ and ‘war’. Many of the anti-tax tweets echoed messages from the ‘No on 29’ television campaign, describing the proposition as ‘flawed’, building a bureaucracy and sending money out of California (one illustrative example: “A Non-smoker's Opposition to #Prop 29 A new tax for a new bureaucracy”). Others were more generically against any taxes: “No new taxes. Stop the tax-happy left from preventing YOUR individual choice”.

Pro-tax tweets emphasised the low current tax on tobacco in California such as “As of 1/1/12, #California's #cigarette #tax on was 18th lowest in the U.S. #Prop29 #QuitSmoking #cancer” and “RT @GavinNewsom: CA hasn't raised cigarette tax in 14 yrs. Prop29 would raise it $1. Vote tomorrow #yeson29 @prop29 @lancearm”. Many tweets also mentioned reducing illness and death as a benefit of increasing revenue, the primary message from the ‘Yes’ campaign.

The majority of tweets supportive of the ‘Yes on 29’ campaign were categorised as health-related messages, with many using versions of the generic message ‘raise the tax to beat cancer’. The ‘Yes’ campaign did not appear to challenge the counterarguments of the ‘No’ campaign. This appeared to be a mistake as it left supporters to try to counter a perceived strong ‘No’ message using only a tweet, for example, “@markhud New ‘bureau’=nine people. Someone's gotta decide how to fund cancer research. Would you prefer scientists, or politicians?”

Finally, there was little evidence that the ‘Yes’ campaign delivered the stronger and ‘stickier’ message of preventing adolescents from starting to smoke. Less than 4% (n=225) of the Twitter messages mentioned teens as a prime reason for supporting the proposition; and when they did, it was usually part of a complicated multicomponent message such as “Tobacco costs U.S. $193 mil/yr in health/productivity. Every day, 3800 kids start smoking and 1200 people die”.

Geolocation of Twitter accounts

Even though the science for geocoding tweets is still developing,26 ,27 we argue that, in a statewide election campaign, tweets from account holders within the state are much more important than those from outside the state. Unfortunately, we searched for these data well after the completion of the campaign (July 2015). By then, 1024 accounts that had posted relevant tweets during the campaign were no longer accessible, suggesting that the campaigns had created accounts specifically for the campaign rather than activating regular Twitter users. These were not equally distributed across campaigns; two-thirds of these defunct accounts had been supportive of Proposition 29.

During the early campaign, about half of the commercial users could be identified as originating from California (figure 3), whereas during the later campaign period, two-thirds of tweets with health or anti-tax themes were identified as coming from outside the state. For influencers and organic accounts, almost two-thirds appeared to be tweeted from outside California in both periods. For the commercial accounts, in the early campaign period approximately half the tweets appeared to be from accounts within California. One-third of all tweets during both study periods were affiliated with the two major donors to the ‘Yes on 29’ campaign: Lance Armstrong (@lancearmstrong: n=2291; @livestrong, n=1836; @livestrongceo, n=805; @livestrong, n=177; @teamlivestrong, n=40) and Mike Bloomberg (@mikebloomberg, n=901). Neither has headquarters in California.

Figure 3

Proportion of tweets from inside California.
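In this study, location was coded manually from tweet and profile information. Purely as an illustration of what a first automated pass over free-text profile locations might look like, the sketch below flags strings containing a California marker. The marker list is hypothetical and deliberately crude (for example, ‘CA’ can also abbreviate Canada), so output of this kind would still need human review.

```python
import re

# Hypothetical, incomplete list of strings taken to indicate a California location.
CALIFORNIA_MARKERS = re.compile(
    r"\b(california|calif|ca|los angeles|san francisco|san diego|sacramento)\b",
    re.IGNORECASE,
)


def looks_californian(profile_location):
    """Return True if the free-text profile location contains a California marker."""
    if not profile_location:
        return False
    return CALIFORNIA_MARKERS.search(profile_location) is not None


def share_in_state(profile_locations):
    """Proportion of profile locations flagged as Californian."""
    flags = [looks_californian(location) for location in profile_locations]
    return sum(flags) / len(flags)
```

Profile locations describe where an account holder says they are based, not where a given tweet was sent from, so any share computed this way is an approximation of in-state participation.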


On two occasions in the past 17 years, California voters have failed to pass propositions targeting an increase in cigarette tax for public health purposes. Although in each case the opposition campaign was much better funded (2012: No campaign US$46.8m vs Yes campaign US$12.3m),28 there are many examples where a massive advantage in campaign financing has been insufficient to ensure an election win (eg, Jerry Brown vs Meg Whitman, California Gubernatorial campaign).29 The rapid diffusion of social media platforms has allowed community members to join the public discourse in real time, while providing researchers with a new tool to analyse campaign messaging strategy and effectiveness.

To identify whether Proposition 29 tweets were part of the campaign discourse, we hypothesised that there needed to be a significant correlation between the timing of tweets about Proposition 29 and advertising expenditures for each campaign, and we found such an association. This supports the hypothesis that tweets can extend the reach of television advertising with a ‘ripple effect’.21 However, this is not the only effective use of tweets. We divided the campaign into two time periods, and the ‘Yes on 29’ campaign had little television exposure during the early campaign period. Nevertheless, Twitter users during this period were predominantly supportive of the ‘Yes’ campaign, apparently using Twitter in an attempt to counter advertising messages from the ‘No’ campaign.

Content analysis of the Proposition 29-relevant tweets identified three general themes related to persuasive messages used by the campaigns. We compared the frequency of tweets on these themes across two campaign-related time periods and identified significant changes: pro-tax tweets decreased and anti-tax tweets increased in the second period before the election-day vote. Messages supporting Proposition 29 in the later campaign period became more focused on a health theme—a theme not even contested by the ‘No’ campaign. We found no evidence that tweets in support of the ‘Yes’ campaign evolved over the campaign period, as might have been expected from a campaign that was steadily losing public support. Simple exhortation messages such as ‘Californians—Vote YES on Prop 29!’ or ‘Yes on Proposition 29!’ assumed that the case for the tax was indisputable and that all that was needed was to ensure that people who supported the proposition would turn out to vote. However, tweets in which obvious supporters struggled to refute the ‘No’ counterarguments showed that such an assumption was short-sighted. No messages from the commercial accounts affiliated with the ‘Yes on 29’ campaign addressed such issues. Previous campaigns30 suggested that preventing children from starting to smoke was so important to the potential electorate that focusing on this message might have been ‘sticky’ and maintained support even through the ‘No on 29’ counterarguments.

Even though the majority of funding for the ‘Yes’ campaign came from out of state, campaigns for an election-day vote must engage the local community. Our analyses suggest that almost two-thirds of the Twitter messages related to this campaign came from outside California, suggesting that neither campaign was very effective at proselytising California voters on the issue. In this study, we analysed users' profiles, which reflect their primary location rather than the location at the time of tweeting. Although imperfect, we believe that account holders from California are more likely to be Californians than account holders from elsewhere. During the signature gathering phase of the proposition, the ‘Yes on 29’ campaign appeared to understand the importance of social media when it identified 38 000 signatories who indicated that they were prepared to use their social media accounts to help the campaign.7 However, over the 9 weeks of the campaign there were only 17 099 tweets—half the number of signatories. This suggests that the campaign did not follow through with this key group of volunteers. Further, our definition of an influencer was exceedingly liberal. Indeed, there was a lack of true influencers from California posting relevant tweets on the proposition, although there are known ways to co-opt such people to assist with the campaign.31 ,32 Using tweets as a yardstick, it would appear that neither campaign was very effective at engaging relevant communities within the state.

A strength of this study is the use of the Twitter Firehose, which contains a census of tweets sent on a social media platform that has been associated with political discourse. Using keywords, we were able to identify tweets that were clearly relevant to the campaign. The multiple-keyword data extraction technique captured a more comprehensive set of tweets than would have occurred using hashtag-specified data collection. This enabled us to differentiate between campaign affiliates and the social conversation among people who were interested in the campaign. Compared to other campaigns,33 the Twitter engagement for this initiative was quite low. Perhaps if the mass media campaigns had promoted social media discussion, usage would have been much higher. A limitation of this study is that Twitter was not the major social media platform in the population at the time (it was much smaller than Facebook, with a less diverse set of users); however, Facebook's privacy policy does not make data available for an analysis such as this.


Data for this study were collected in 2012 when social media were still disseminating at a rapid rate across the US population. However, even at this early stage of social media adoption, our analysis suggests that the inability of the ‘Yes on 29’ campaign to effectively counter the arguments of the ‘No on 29’ campaign was associated with the inexorable slide in popular support that occurred throughout the 9-week campaign. These data make the case for including Twitter analysis in future public health campaign evaluations.

What this paper adds

  • The diffusion of social media has allowed the public to comment on public health campaigns.

  • Analysis of campaign-related tweets can demonstrate the level of engagement of the community and should be included in campaign evaluations.

  • The general health message chosen to lead the ‘Yes on Prop 29’ campaign to increase tobacco tax was insufficient to withstand aggressive counterarguments of the ‘No’ campaign.

  • A major failure was that the ‘Yes’ campaign did not adapt its message strategy even with evidence that it was losing public support.


The authors thank Sheila Kealey for her technical help in editing and manuscript preparation.




  • Twitter Follow Glen Szczypka at @glenszczypka and Lisa Vera at @lisavera57

  • Contributors MF contributed to analysis, writing, reviewing and editing of the manuscript. JPP and SE contributed to concept development, obtaining funding, writing, reviewing and editing of the manuscript. GS contributed to data preparation. LV contributed to writing, reviewing and editing of the manuscript.

  • Funding This research was supported by the California Tobacco-Related Disease Research Program (TRDRP grant number 22RT-0144) and by the National Institutes of Health (National Cancer Institute grant number 5 U01 CA154254).

  • Disclaimer The opinions expressed here are those of the authors, and do not necessarily reflect those of the sponsors.

  • Competing interests None declared.

  • Ethics approval University of California, Human Research Protections Program Office.

  • Provenance and peer review Not commissioned; externally peer reviewed.
