Objectives: To examine divergent estimates of smoking prevalence in two random digit dial surveys for the same population. Based upon internal and external reviews of survey procedures, differences in survey introductions (general health versus tobacco specific introduction) and/or differences in the use of filter questions were identified as the most likely explanations. This prompted an experiment designed to investigate these potential sources of measurement error.
Design: A randomised 2 × 2 factorial experiment.
Setting: A random digit dial telephone survey from July to September 2000.
Subjects: 3996 adult Californian respondents.
Main outcome measures: A series of smoking prevalence questions in the context of a tobacco or general health survey.
Results: Logistic regression analyses suggest that, among females, prior knowledge (from the survey introduction) that a survey is concerned with tobacco use may decrease self reported smoking prevalence (approximately 4% absolute prevalence difference). Differences in the use of filter questions resulted in almost no misclassification of respondents.
Conclusions: The tobacco specific survey introduction is causing some smokers to deny their tobacco use. The data suggest that these smokers tend to be women who smoke occasionally. A desire by the participants to minimise their personal time costs or a growing social disapproval of tobacco use in the USA may be contributing to the creation of previously undetected survey artefacts in the measurement of tobacco related behaviours.
- BRFS, Behavioral Risk Factor Survey
- CASRO, Council of American Survey Research Organizations
- CATS, California Adult Tobacco Survey
- CDHS, California Department of Health Services
- SRG, Survey Research Group
Historically, smoking prevalence surveillance data were used for epidemiological research and to monitor health problems associated with smoking. In addition to surveillance, tobacco control programmes now also use these data to target and evaluate interventions.1,2 Legislative bodies also sometimes consider these data when setting and funding health policy.2 Given the critical public health and policy implications of these data, accurate measurement is an important requirement.
Tobacco prevalence is typically estimated via self reported information collected during survey interviews. Studies of the validity of these self reports suggest that they are generally valid indicators of actual behaviour. A recent meta-analysis of 51 comparisons of self reported behaviour and various biochemical assessments, for example, concluded that self reports are both sensitive (mean 87.5%) and specific (mean 89.2%).3 However, other evidence suggests that self reports of cigarette use may under represent the actual extent of this behaviour.4 It has also been suggested that declining public tolerance of smoking has changed the social acceptability of this behaviour and increased the demand characteristics of the interview situation, thereby decreasing respondent willingness to report smoking3,5 and decreasing the accuracy of survey reports of tobacco use.5,6 An association between survey non-response and tobacco use suggests that low response rates may also reduce the accuracy of data collected by tobacco surveillance systems.7–9 Continued assessments of the accuracy of self reported tobacco use, therefore, remain a necessity. In this paper, we present recent research into potential sources of measurement error in the California tobacco surveillance system that was prompted by divergent estimates from two data collection sources.
Since 1993, Californians’ beliefs, attitudes, and behaviours about smoking have been monitored throughout the year by the California Adult Tobacco Survey (CATS) and the Behavioral Risk Factor Survey (BRFS). The California Department of Health Services’ (CDHS) Tobacco Control Section and the Public Health Institute’s Survey Research Group (SRG) administer the surveys and report the smoking prevalence of California adults aged 18 years and older. The BRFS contains questions about many areas of public health, including tobacco use, while tobacco use is the sole focus of the CATS. In 1998, the estimated smoking prevalence based on the BRFS was 19.7% (95% confidence interval (CI) 18.5% to 21.0%), compared to an estimate of 17.1% (95% CI 16.0% to 18.3%) for the CATS. Quarterly data revealed that a consistent divergence in smoking prevalence began in the third quarter of 1997 (fig 1), with the CATS estimate always lower than that of the BRFS.
In 1996, the current smoking status question changed in both surveys to match a new question from the Centers for Disease Control and Prevention. The question changed from “Do you smoke cigarettes now?” to “Do you now smoke cigarettes every day, some days, or not at all?”. This alteration led to an increase in estimated smoking prevalence of about 1.0%. Most of the change was due to an increase in the number of occasional smokers identified.10 The quarterly data from the BRFS and the CATS indicated that the CATS produced lower prevalence estimates soon after the question was modified in 1996; this divergence appeared to be amplified after mid 1997 and was evident among both males and females (data not shown). This definition change appears to have amplified the prevalence difference between the two surveys by directionally misclassifying some occasional smokers.
Several potential sources of cross survey variability, including differences in survey protocols, weighting procedures, and data processing, were considered. From this review, two main hypotheses emerged, along with several other explanations that were considered unlikely.
Hypothesis 1, survey introduction: From 1993 to 1996, both the CATS and the BRFS had a general health introduction. Beginning in 1997, the CATS introduction stated that the survey was about tobacco use. The specificity of the introduction may cue respondents to adjust their responses (that is, deny tobacco use) in order to shorten the length of the interview experience. (The exact wording of the introductions can be obtained from the authors.)
Hypothesis 2, question filters: In the CATS, a filter question asks about smoking experimentation before questions about current cigarette use (that is, “Have you ever tried or experimented with cigarette smoking, even a few puffs?”). Respondents who indicate any history of smoking experimentation then are asked about current smoking. In the BRFS, no filter question is used; all respondents are asked about current cigarette use. We hypothesised that some occasional smokers might misunderstand the CATS question about experimentation and bypass the cigarette use questions, thereby lowering the smoking prevalence estimates for that survey. The filter question also provides an opportunity for occasional smokers to select themselves out of the study.
Other possible explanations were:
- the placement of the smoking questions varied slightly between the two surveys (14th and 15th for the CATS and between 45th and 60th for the BRFS)
- the sampling designs for each survey diverged slightly: known businesses were prescreened out of the CATS sample but not the BRFS sample starting in 1998
- differences in demographics, either because the data were not weighted by income or education or because of a computational error in the development of the sample weights
- systematic differences in the data editing procedures.
Based upon internal and external reviews of survey procedures, these four alternative explanations were eliminated as likely causes of the differential smoking prevalence rates. Hypotheses 1 (differential survey introductions) and/or 2 (differential use of filter questions) were retained as likely explanations.
Hypothesis 1 suggests that because the CATS introduction cues respondents to the survey topic, some smokers may intentionally deny their habit in an effort to minimise the length of the survey interaction or to provide a socially desirable response. The new introductory statement coincided with the divergent prevalence rates. Our second hypothesis was that the filter question regarding experimentation with cigarettes may screen out smokers who either do not feel they ever “experimented” with cigarettes (but who nonetheless use them) or who misunderstand the question.
The effects of differences in introductory statements (hypothesis 1) and the use of a filter question (hypothesis 2) on smoking prevalence estimates were assessed in a randomised 2 × 2 factorial experiment. Table 1 presents the four conditions to which survey respondents were randomly assigned. Specifically, each questionnaire had either a general health introduction or a smoking focused introduction. The general health introduction was used in the BRFS and stated: “We’d like to ask you some questions about heart disease, cancer, diabetes, tobacco products, and other important issues facing Californians today.” The smoking introduction was the one used in the CATS and read: “We’re doing a study of California residents to gather information on people’s beliefs, attitudes, and behaviour about smoking.”
In a similar fashion, one version of the smoking module in each questionnaire began with the question regarding experimentation with smoking; the other version began by immediately asking about cigarette use. The experimentation filter question asked: “Have you ever tried or experimented with cigarette smoking, even a few puffs?” This question was followed by: “Have you smoked at least 100 cigarettes in your entire life?” Respondents who replied affirmatively to the “100 cigarettes” question then were asked: “Do you now smoke cigarettes every day, some days, or not at all?” The other versions began with the “100 cigarettes” question followed by the current smoking status question.
Each experimental condition is referred to as an “arm” of the experiment. Arms are defined as follows: arms 1 and 2 employed the smoking introduction, while arms 3 and 4 employed the general health introduction. When the core smoking questions were reached, arms 1 and 3 led with the experimentation filter question, whereas arms 2 and 4 led with the “100 cigarettes” question. Arm 1 was most similar to the CATS, and arm 4 was most similar to the BRFS. In all versions, the first 26 or 27 questions (depending on the use of the filter question) were from the CATS and were followed by 14 demographic questions and five smoking attitude questions. The surveys with the general health introduction concluded with 17 general health questions about physical health, mental health, insurance coverage, asthma, diabetes, and exercise.
Respondents to arms 1 and 3 who had neither experimented nor smoked 100 cigarettes were classified as “non-smokers”, as were those who said they had smoked 100 cigarettes but currently did not smoke at all. To be consistent with the algorithm used in the CATS survey, respondents who said they never experimented with tobacco also were classified as non-smokers. However, they still were asked the “100 cigarettes” item and about their current smoking status. These data are also reported below.
Respondents interviewed in arms 2 and 4 were classified as smokers in a manner identical to the definition employed in the BRFS—that is, they were identified as smokers if they indicated having smoked at least 100 cigarettes in their lifetimes and were currently smoking either “every day” or “some days”.
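The arm definitions and classification rules described above can be summarised in a short sketch. This is an illustration only, not the code used for the surveys; the names `ARMS` and `classify_smoker` and the argument names are hypothetical.

```python
from typing import Optional

# Arm number -> (introduction type, whether the experimentation filter is asked first).
ARMS = {
    1: ("smoking", True),          # most similar to the CATS
    2: ("smoking", False),
    3: ("general health", True),
    4: ("general health", False),  # most similar to the BRFS
}


def classify_smoker(smoked_100: bool, current: str,
                    experimented: Optional[bool] = None) -> bool:
    """Return True if the respondent is counted as a current smoker.

    experimented is None in arms 2 and 4, where no filter question is asked.
    In arms 1 and 3, a "no" to the filter question classifies the respondent
    as a non-smoker regardless of later answers (the CATS-style rule).
    """
    if experimented is False:
        return False
    if not smoked_100:
        return False
    return current in ("every day", "some days")
```

For example, `classify_smoker(True, "some days")` counts an occasional smoker in arms 2 and 4, whereas the same answers with `experimented=False` in arms 1 and 3 yield a non-smoker classification. This is the potential misclassification that hypothesis 2 is concerned with.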
These surveys were administered via telephone to statewide samples in California from July to September 2000. The survey sampling methodology was identical to that used in the BRFS and the CATS. The sample size of each arm was approximately 1000. The telephone sample was obtained from the sampling company Genesys at the beginning of each month as follows: 8000 on 29 June, 8000 on 2 August, and 9200 on 31 August. The response rate for the four arms combined, calculated using the Council of American Survey Research Organizations’ (CASRO) formula,11 was 43.7%. This response rate is similar to those of the BRFS and CATS during this same period, and there was no noticeable difference in response rates between any of the arms. A demographic comparison indicates that the populations interviewed using each arm were very similar (table 2), with no statistically significant differences using a χ² test. The surveys were conducted in compliance with the standards of the CDHS committee for the protection of human subjects. Respondents had a choice of being interviewed in English or Spanish, and 9.2% of the respondents chose Spanish.
For all analyses, the data were weighted to the 1990 California population using sex, six age groups, and four race groups. A separate weight was created for comparisons such that they would be valid across age, race, and sex groups.
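As a rough illustration of weighting a sample to a reference population, the sketch below computes simple cell based (post-stratification) weights and a weighted prevalence. The paper does not describe its exact weighting procedure, so the method, the function names, and the toy data here are all assumptions.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Weight each respondent by (population share) / (sample share) of their cell."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


def weighted_prevalence(is_smoker, weights):
    """Weighted proportion of respondents classified as smokers."""
    return sum(w for s, w in zip(is_smoker, weights) if s) / sum(weights)


# Toy data (hypothetical): cells would really be sex x age group x race group.
cells = ["F", "F", "F", "M"]
shares = {"F": 0.5, "M": 0.5}
weights = poststratification_weights(cells, shares)
smokers = [False, False, True, True]
prevalence = weighted_prevalence(smokers, weights)
```

In this toy sample men are under-represented, so the single male respondent receives a larger weight than each female respondent, and the weights sum to the sample size.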
Because smoking prevalence differs greatly by ethnic, age, sex, and educational groups,12 a series of logistic regression models were estimated to examine experimental effects, controlling for potential differences in sample composition. Interview language (English v Spanish) was also included as a covariate. Since 9% of respondents did not report income, we included an indicator for these respondents in our models. Model selection was performed using differences in −2*log(likelihood) and model fit. Consequently, we will discuss model fits instead of β coefficient significance testing.13 Statistical analyses and data management were carried out with the SAS system (version 8.0, Cary, North Carolina, USA).
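Comparing nested models by the difference in −2*log(likelihood) amounts to a likelihood ratio test. The sketch below handles only the simplest case, in which the larger model adds a single parameter (one degree of freedom); the function name and any input values are hypothetical, and it is not a reproduction of the SAS analysis.

```python
import math


def likelihood_ratio_test_1df(neg2ll_reduced, neg2ll_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic and its p value from the chi-square
    distribution with 1 degree of freedom, using P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = neg2ll_reduced - neg2ll_full
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value
```

A drop of about 3.84 in −2*log(likelihood) when one term is added corresponds to p ≈ 0.05; models differing by more than one parameter would need the chi-square distribution with the matching degrees of freedom.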
Smoking prevalence estimates are summarised in table 3 by experimental arm, sex, and ethnicity. In general, these data suggest that smoking prevalence rates ranged from 15.3–19.8% across the four experimental groups. When examined by sex, the smoking estimates by experimental condition exhibited a wider range for males, between 16.7–26.9%, compared to females (range 12.6–18.0%). When disaggregated by ethnicity, smoking estimates ranged from 17.0–19.9% for non-Hispanic and, more broadly, from 9.3–19.3% for Hispanic respondents. Three way assessments of experimental arm, sex, and ethnicity suggest that variability by interview condition is greatest for Hispanic males and least for females, regardless of ethnicity.
To examine hypothesis 1, smoking prevalence estimates from arms 1 and 2 were compared with estimates from arms 3 and 4. Overall, no differences were found. Of respondents informed that the survey was about smoking, 17.5% reported current tobacco use; among those informed that the survey was about general health issues, the current smoking rate was 17.1%.
Hypothesis 2 was examined by comparing the smoking prevalence estimates obtained when a filter question was and was not used (arms 1 and 3 v arms 2 and 4). The current prevalence rates for these two conditions were 16.1% and 18.6%, respectively. The difference of 2.8% (95% CI 0.5% to 5.1%) was significant, suggesting that the insertion of a filter question could be a potential source of differences in the estimated prevalence rates.
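A comparison of this kind can be sketched as a normal approximation (Wald) confidence interval for the difference between two independent proportions. The sketch is unweighted and treats the two groups as simple random samples. The group sizes (1991 respondents in arms 1 and 3, leaving 2005 of the 3996 in arms 2 and 4) are taken from figures reported elsewhere in this paper. Because the published estimates are weighted and rounded, this sketch is not expected to reproduce the reported difference or interval exactly.

```python
import math


def diff_of_proportions_ci(p1, n1, p2, n2, z=1.96):
    """Wald confidence interval for p2 - p1 from two independent samples."""
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Filter question used (arms 1 and 3) v not used (arms 2 and 4), unweighted.
diff, lower, upper = diff_of_proportions_ci(0.161, 1991, 0.186, 2005)
```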
Logistic regression analysis
To simultaneously evaluate both conditions, examine interactions, and control for demographic characteristics, we next modelled the data via logistic regression. Indicator variables were created for the type of introduction (1 for the smoking introduction and 0 for the general health introduction) and the presence of a filter question (1 for the filter question and 0 without the filter question). An interaction term for the introduction and smoking filter question was created to determine if there were differential effects. In addition, all three way interactions were considered, but only those having an effect are presented. Note that when interactions are present and non-zero, the main effect β coefficients are meaningless.14
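One way to write such a model, in notation that is ours rather than the paper's, uses I for the introduction indicator, F for the filter indicator, S for a sex indicator (coding assumed), and x for the demographic covariates. The full hierarchical form shown is an assumption; the paper does not state which lower order terms were retained.

```latex
\[
\operatorname{logit}\Pr(\text{smoker})
  = \beta_0 + \beta_I I + \beta_F F + \beta_S S
  + \beta_{IF}\, I F + \beta_{IS}\, I S + \beta_{FS}\, F S
  + \beta_{IFS}\, I F S + \mathbf{x}^{\top}\boldsymbol{\gamma}
\]
\[
\text{Effect of the smoking introduction on the log odds}
  = \beta_I + \beta_{IF} F + \beta_{IS} S + \beta_{IFS} F S
\]
```

The second line shows why a main effect coefficient cannot be read in isolation: the effect of the introduction depends on the filter condition and on sex, and equals the main effect coefficient alone only when F = 0 and S = 0.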
In table 4, we present three models. Based on forward logistic regression, we chose the model in the middle as our final model because of the model fit criteria. Reassuringly, the model effects are consistent across these models. Odds ratios are not presented because of the significant two way and three way interactions.
Our final model selection includes the main effects for demographics that are useful in predicting smoking prevalence. They include income level, education level, sex, Hispanic, and taking the survey in Spanish. In addition, the final model selection includes a three way interaction for the introduction, smoking filter question, and sex (table 4).
Income level, educational level, and being of Hispanic origin are clearly good predictors in our sample. This model also shows that respondents who were interviewed in Spanish have a lower probability of reporting smoking than English speaking respondents. This finding is not surprising given the relationship of acculturation and smoking in the Hispanic population.
The three way interaction fits the high smoking prevalence among men for arm 2, since the model without the interaction must accommodate this differential effect. However, further examination also shows that the two way interaction for the introduction and sex also is important. Women with the smoking introduction do not report smoking as much as women with the general health introduction (table 3).
To examine whether occasional smokers could be opting out of the survey based on the type of introduction or use of a filter question, we examined “every day” and “some days” smoking status for each arm. “Every day” smoking prevalence is similar in all arms except arm 2, which has a higher reported level. “Some days” smoking prevalence is lower in arm 1 than in arms 2, 3, and 4. This is consistent with the BRFS and the CATS data from 1996 to 1999.
To determine if the filter question was being misunderstood, all respondents in arms 1 and 3 were asked if they had smoked at least 100 cigarettes in their lifetimes. Of the 1991 respondents, nine (0.45%) answered that they had smoked 100 cigarettes but had never experimented with smoking (four in arm 1 and five in arm 3). Upon further investigation, eight of the nine were Hispanic men, and five of these eight had an eighth grade or less education. However, only two of the nine responded that they smoked “every day” or “some days”. Consequently, although some respondents may not understand the experimentation question, this did not contribute to the divergence in prevalence, since only two out of 1991 (0.1%) were misclassified.
A large smoking prevalence increase took place in the last month of the three month period (September), especially among respondents in arm 2. Upon further examination, the increase for September across all the arms was among men of all ethnic groups and all age groups. A simple explanation for this occurrence cannot be made. However, when the month of September is excluded from the logistic regression analyses, the results and conclusions do not substantially change.
Although considerable research currently is being directed at the problem of survey non-response,15 little is known about how respondents may interpret cues to abbreviate the length of their survey participation. Our data suggest that, under some conditions, respondents may indeed provide answers that seem likely to minimise their personal time costs. Krosnick16 has discussed the concept of “satisficing” which suggests that respondents may minimise their cognitive effort during a survey interview by providing acceptable responses rather than expending the energy necessary to produce correct responses. A similar mechanism may underlie the process observed here, as respondents may provide acceptable answers designed to disqualify themselves from more extensive survey participation by denying tobacco use after being informed that the survey is concerned with smoking.
In the simple comparison of the arms, the filter question appears to make a difference in prevalence estimates. However, on follow up with the logistic regression analysis and an examination of respondents who misinterpreted the filter question, the filter question does not cause the prevalence differences that we are observing. By accounting for the high prevalence among men in arm 2 with an interaction term, the logistic regression model suggests that women report lower smoking prevalence when informed about the study’s tobacco focus. This effect is compounded by the larger quantity of occasional smokers among women in the California population.17
Survey refusal rates are typically higher among males.18,19 Females who are reluctant to participate may be less likely to overtly refuse participation but instead opt for other means of limiting the survey interaction. Denying the behaviour of interest during the interview may be one socially acceptable alternative. Social desirability concerns may also contribute to this finding, as females may be particularly susceptible to anti-smoking pressures when provided with advance knowledge regarding the survey’s specific interests. The growing anti-smoking climate in California may have caused this effect to be more pronounced, especially in a population that participates in the behaviour less often as is the case with women who smoke only occasionally. California women have consistently reported more negative attitudes about smoking than men and this general sentiment may influence the willingness of some to admit tobacco use.17 Consequently, the social stigma associated with smoking may influence the quality of self reported surveillance data.
What this paper adds
Numerous studies have shown that self reported tobacco behaviours are valid measures of actual behaviours. Differing smoking prevalence using two different but very similar surveys in California prompted an investigation and subsequently a factorial experiment into the cause of the divergence.
This factorial experiment suggests that women may be influenced in their responses to smoking questions by social stigmatisation. Based on this study, tobacco researchers and evaluators need to maintain rigorously consistent survey methods to assure comparability of results between surveys. Changing survey methods or questionnaires is likely to lead to differences in the estimates of smoking prevalence.
However, a simple interpretation and conclusion cannot be drawn for men in our study (table 3). The difference in results for men and women is not surprising if social stigma is influencing the self reporting of tobacco use based on the argument given above.
The problem of misclassification of smokers needs to be examined as more states begin to supplement their general health surveys, such as the BRFS, with surveys concerned specifically with tobacco use. The misclassification of smokers as non-smokers would affect not only the monitoring of smoking prevalence, which has far reaching political implications, but also estimates of attributable risk. These findings are perhaps an unfortunate but inevitable consequence of the increasing social stigmatisation of tobacco use in the USA and/or of respondents’ need to minimise personal time costs.
Certain limitations exist for this study. First, the research was conducted several years after the smoking prevalence divergence occurred; hence, we are dependent on the current sample being similar to historical and current BRFS and CATS samples to be able to draw conclusions as to the cause of the divergence. Of course, the observed prevalence differences, both in the historical trends depicted in fig 1 and in this experiment, may be attributable to nothing more than sampling error. Although well constructed and executed to detect systematic sources of measurement error in the surveys, this study is not precise enough for us to provide an estimate of the size of the introduction effect on our BRFS and CATS smoking prevalence estimates. Other findings, however, also suggest that growing social disapproval of tobacco use in the USA may be contributing to the creation of previously undetected survey artefacts in the measurement of tobacco use.
The authors would like to thank Dileep G Bal, MD, and William Wright, PhD, for their facilitation of the project, and James Gehrman, PhD, and Doraiswamy Ramachandran, PhD, for the external statistical review of the data.