Methods of the International Tobacco Control (ITC) China Survey

Changbao Wu (1), Mary E Thompson (1), Geoffrey T Fong (1), Qiang Li (1), Yuan Jiang (2), Yan Yang (2), Guoze Feng (2)

(1) University of Waterloo, Waterloo, Ontario, Canada
(2) Chinese Center for Disease Control and Prevention, Beijing, China

Correspondence to Professor Changbao Wu, Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1; cbwu{at}uwaterloo.ca

Abstract

This paper describes the design features, data collection methods and analytical strategies of the ITC China Survey, a prospective cohort study of 800 adult smokers and 200 adult non-smokers in each of six cities in China. In addition to features and methods which are common to ITC surveys in other countries, the ITC China Survey possesses unique features in frame construction, a large first phase data enumeration and sampling selection; and it uses special techniques and measures in training, fieldwork organisation and quality control. It also faces technical challenges in sample selection and weight calculation when some selected upper level clusters need to be replaced by new ones owing to massive relocation exercises within the cities.

  • Survey methods
  • longitudinal studies
  • questionnaire design
  • quality control
  • data collection
  • advocacy

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.

Introduction

The International Tobacco Control (ITC) Policy Evaluation Project was created in 2002. It was conceived as a research tool to measure the effectiveness of national-level tobacco control policies in selected countries which signed and ratified the Framework Convention on Tobacco Control (FCTC). The ITC project possesses several unique features that set it apart among studies on tobacco control. It was designed based on a conceptual model which assumes that each tobacco control policy ultimately has an influence on behaviour through a specific causal chain of psychological events.1 While the formulation and inclusion of survey questions (variables) are guided by the assumed conceptual model and the provisions of the FCTC, two other key features of the study are the longitudinal and international aspects of its design. The longitudinal data structure allows the psychosocial and behavioural changes before and after the implementation of a particular tobacco control policy in a country to be measured and compared. The use of the same model and tools in different countries permits one or more countries to be used as control groups in cross-country comparisons, and allows the impact of cultural, geographical and economic differences on the effectiveness of certain tobacco control policies to be studied.

The ITC survey first started in four large English speaking countries, namely Canada, USA, Australia and the UK (the ITC-4 Survey). It is a random digit dialled telephone survey of over 2000 adult smokers in each of the four countries. The first wave of the survey was conducted in 2002. In subsequent waves, the initial group of respondents was followed and a new cross-sectional replenishment sample was added to make up for the reduced size of the longitudinal sample owing to attrition. The dual design (longitudinal and cross-sectional) is another important feature of the ITC survey, which allows the examination of the effects of attrition and time-in-sample. Thompson et al2 give details of the features, data collection methods and statistical methods for the ITC-4 Survey.

The ITC project has been growing steadily, with many countries of geographical and strategic importance being added to the initial ITC-4 Survey. Among the significant expansions was the launch of the ITC China Survey in 2006. The ITC China Survey is a prospective cohort study of 800 adult smokers and 200 adult non-smokers in each of six cities in China: Beijing, Shanghai, Guangzhou, Shenyang, Changsha and Yinchuan.i In addition to features and methods which are common to ITC surveys in other countries, the ITC China Survey possesses unique features in frame construction, a large first phase data enumeration and sampling selection. It uses special techniques and measures in training, fieldwork organisation and quality control. It also faces technical challenges in sample selection and weight calculation when some selected upper level clusters need to be replaced by new ones owing to massive relocation exercises within the cities, as occurred with two of the cities at Wave 1 and Wave 2.

This paper describes methods used in the ITC China Survey. Special attention is given to design features, training, fieldwork organisation and quality control measures. Additional details are provided in the ITC China Survey Wave 1 Technical Report, which can be found at http://www.itcproject.org.

Design features

It was clear at the beginning of the planning stage that a nationally representative sample was not feasible, and that the survey would have to be carried out through face-to-face interviews. First, any attempt to cover the vast rural areas in China would require tremendous resources and staff levels, and the ITC China project is clearly not equipped to achieve that goal. Second, most Chinese people are not used to accepting long interviews by telephone. Given the complexity, the sophistication and the longitudinal nature of the ITC survey, it was decided that the survey should be conducted in selected cities through face-to-face interviews. Another important consideration was that any tobacco control policy to be implemented by the Chinese government will probably start in major cities. A prominent example is the introduction of new regulations and restrictions on smoking in public venues in Beijing, put in place prior to the Beijing Olympics in the summer of 2008.

The target population

The six cities in the ITC China survey do not constitute a random sample of the entire population of China. They were judiciously selected based on geographical representations and levels of economic development. Beijing, Shanghai and Guangzhou are the three largest cities in the north, east and south of China, and these three cities are all in the forefront of China's economic development in recent years. Shenyang is the largest city in the north east. Changsha is a mid-sized city in the southern central part of China and is also one of the major bases for the Chinese tobacco industry. Yinchuan is an economically less developed city in the northwest region.

The mobile population in these cities is not eligible for the study owing to the requirement of follow-ups in subsequent years. The well established city registration system for permanent residents makes this exclusion easy to implement. The target population of the ITC China Survey consists of smokers and non-smokers who are 18 years or older, are permanent residents and live in residential buildings in each of the six cities. Smokers are defined as those who have smoked at least 100 cigarettes in their lifetime and are currently smoking at least once a week. Ex-smokers are not considered as a separate category in Wave 1 of the ITC China Survey.

Sample size

The overall sample size of the survey is 4800 for adult smokers and 1200 for adult non-smokers for the baseline Wave 1, with 800 smokers and 200 non-smokers surveyed in each of the six cities. This choice of sample sizes was based not primarily on power calculations but rather on a practical allocation of available resources. However, the sample size for smokers is large enough not only to obtain reliable statistics at the aggregated level but also to have meaningful estimates for each city. The sample of non-smokers with smaller sizes is constrained by the available resources but it nonetheless provides opportunities to examine differences in some of the key psychosocial and behavioural measures between smokers and non-smokers. At subsequent waves replenishment samples of smokers as well as non-smokers are added to compensate for the losses to follow-up owing to attrition in the longitudinal sample.

Frame construction and sample selection

The ITC China Survey employs a stratified multistage cluster sampling design. Each city is treated as a stratum and within each city, there is a natural and well established hierarchical administrative system which provides excellent coverage of the target population:

City→ street district (Jie Dao)→ residential block (Ju Wei Hui)→ household

The Jie Dao and Ju Wei Hui are two levels of administrative units under the city government. More importantly, the ITC China team has strong communication links with the Jie Dao and Ju Wei Hui staff members, who play crucial roles in the first phase data enumeration as well as coordination for the survey interview.

In each of the six cities, 10 Jie Dao were randomly selected, with probability of selection proportional to the population size of the Jie Dao. Within each of the 10 sampled Jie Dao, two Ju Wei Hui were selected, again with probability proportional to the population size of the Ju Wei Hui. The randomised systematic PPS sampling method was used to select the Jie Dao and Ju Wei Hui. Within each selected Ju Wei Hui, a complete list of addresses of the dwelling units (households) was first compiled from administrative data, and then a sample of 300 households was drawn from the list by simple random sampling without replacement. In this way, the second phase sampling frame of 6000 households was constructed in each city, and the frame itself can be viewed as a first phase sample from the city population. The use of PPS sampling at each of the first two stages (Jie Dao and Ju Wei Hui), and a simple random sample of an equal number (300) of households in each selected Ju Wei Hui, ensured that each eligible household in the city had approximately the same chance of being included in the frame of 6000.
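The claim of approximately equal inclusion chances can be checked with a short calculation. The notation below is introduced here for illustration and does not appear in the original text: M is the city population, M_i the population of Jie Dao i, M_{ij} the population of Ju Wei Hui j within it, and H_{ij} the number of households on its address list (assumed to be at least 300). It is also assumed that the PPS inclusion probabilities at each stage are exactly proportional to size.

```latex
\pi_{ijk}
  = \underbrace{\frac{10\,M_i}{M}}_{\text{Jie Dao}}
    \times \underbrace{\frac{2\,M_{ij}}{M_i}}_{\text{Ju Wei Hui}}
    \times \underbrace{\frac{300}{H_{ij}}}_{\text{household}}
  = \frac{6000}{M}\cdot\frac{M_{ij}}{H_{ij}}
```

Under these assumptions the probability depends on the cluster only through M_{ij}/H_{ij}, the average number of residents per household in the Ju Wei Hui. It is therefore close to constant across the city whenever average household size varies little between Ju Wei Hui.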

A complete enumeration of the 6000 households was conducted prior to the selection of individuals. In the process, information on age, gender and smoking status for all adults living in these households was collected. The enumerated 300 households within each Ju Wei Hui were randomly ordered, and adult smokers and non-smokers were then approached following the randomised order until 40 adult smokers and 10 adult non-smokers were surveyed. Because of low smoking prevalence among women, one male smoker and one female smoker from each selected household were surveyed whenever possible to increase the sample size for women smokers. At most one non-smoker was interviewed per household. Where there was more than one person in a sampling category to choose from in a household, the next birthday method was used to select the individual to be interviewed, and the selection was done prior to the household visit. Proxy interviews were not allowed in the ITC China Survey.
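The within-household selection and the quota rule described above can be sketched as follows. This is a simplified illustration, not the project's fieldwork software: the record layout (dictionaries with "sex", "smoker" and "birth" keys), the function names, the treatment of a birthday falling on the interview date as the "next" birthday, and the assumption that every selected person responds are all hypothetical choices made for the example.

```python
import random
from datetime import date

def days_until_next_birthday(birth, today):
    """Days from `today` to the person's next birthday (0 if it is today)."""
    def birthday_in(year):
        try:
            return birth.replace(year=year)
        except ValueError:            # born 29 February, non-leap year
            return date(year, 3, 1)
    upcoming = birthday_in(today.year)
    if upcoming < today:
        upcoming = birthday_in(today.year + 1)
    return (upcoming - today).days

def next_birthday_pick(candidates, today):
    """Choose the candidate whose birthday comes up soonest."""
    return min(candidates,
               key=lambda p: days_until_next_birthday(p["birth"], today))

def fill_quotas(households, today, rng, n_smokers=40, n_nonsmokers=10):
    """Walk households in random order until both quotas are met.

    Per household: at most one male smoker, one female smoker and
    one non-smoker, each chosen by the next-birthday rule.
    """
    order = list(households)
    rng.shuffle(order)                # randomised order of enumerated households
    smokers, nonsmokers = [], []
    for members in order:
        if len(smokers) >= n_smokers and len(nonsmokers) >= n_nonsmokers:
            break
        for sex in ("M", "F"):
            pool = [p for p in members if p["smoker"] and p["sex"] == sex]
            if pool and len(smokers) < n_smokers:
                smokers.append(next_birthday_pick(pool, today))
        pool = [p for p in members if not p["smoker"]]
        if pool and len(nonsmokers) < n_nonsmokers:
            nonsmokers.append(next_birthday_pick(pool, today))
    return smokers, nonsmokers
```

Because the next-birthday rule depends only on dates of birth collected at enumeration, the selection can be made before the household visit, as the text describes.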

In order to deal with the potential impact of attrition in this cohort survey, at each subsequent wave, those respondents from the previous wave who are lost to attrition are to be replaced (ie, the cohort is to be replenished) by extending the sampling procedure using the same sampling frame that has been constructed at Wave 1. The way that the initial sampling frame was constructed allows this to be a practical possibility. The Wave 2 replenishment survey, for example, drew its sample from the same list of 300 enumerated households that was constructed in the Wave 1 survey for each Ju Wei Hui; households that were not surveyed in Wave 1 were randomly ordered, and adult smokers and non-smokers were recruited in accordance with the procedures described above for Wave 1. If the list of 300 households was exhausted before the desired quota was reached, available households from an adjacent Ju Wei Hui were used to fill the quota. In Wave 2, this happened four times in Shanghai, three times in Changsha and not at all in the other four cities. In Shenyang, there was a massive loss of Wave 1 respondents within one Jie Dao because they were living in an area where all of the residents were moved under the city's relocation exercise. They could not be contacted at Wave 2. To compensate for this dramatic and unforeseen loss, an entire new Jie Dao was selected in that city, following the procedures that had been used to construct the sampling frame for Wave 1; the 300 enumerated households thus constituted the sampling frame for the Wave 2 replenishment survey in the new Jie Dao, and sampling proceeded as above. In Guangzhou, a similar scenario occurred for one Ju Wei Hui, and a new Ju Wei Hui within the same Jie Dao was added to the Wave 2 replenishment survey. 
The impact of substituting an upper level cluster on the inclusion probabilities of the resulting sampling design under an initial multistage PPS sampling scheme is further discussed in the section on statistical methods.

The stratified multistage cluster sampling design used for the ITC China Survey is very attractive in terms of frame construction and coverage properties. This type of design is generally popular and efficient for large-scale population surveys and was well documented by Kish3 and Lohr.4 There exist several PPS sampling procedures in the survey literature, and the one used for selecting the first stage clusters Jie Dao and second stage clusters Ju Wei Hui in the ITC China Survey was the randomised systematic PPS sampling method. The procedure was first described in Goodman and Kish5 as a controlled selection method, and was later refined by Hartley and Rao.6 It is the simplest procedure to implement among alternative PPS sampling methods.
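A minimal sketch of randomised systematic PPS sampling is given below for readers unfamiliar with the procedure. It is an illustration under stated assumptions rather than the code used by the ITC China team: the function name and the example unit labels are hypothetical, size measures are assumed to be positive integers, and no unit is allowed to be so large that it could be selected twice. Integer arithmetic is used so that each unit's inclusion probability is exactly n times its share of the total size.

```python
import random

def randomised_systematic_pps(sizes, n, rng=None):
    """Select n distinct units with probability proportional to size.

    sizes: dict mapping unit label -> positive integer size measure.
    Steps: (1) put the units in random order, (2) lay their sizes end
    to end, (3) take a random start and then every (total/n)-th point;
    a unit is selected when a point falls inside its interval.
    All lengths are multiplied by n so that only integers are needed.
    """
    rng = rng or random.Random()
    total = sum(sizes.values())
    if any(n * m > total for m in sizes.values()):
        raise ValueError("a unit is large enough to be selected twice")
    units = list(sizes)
    rng.shuffle(units)                  # the 'randomised' part
    start = rng.randrange(total)        # random start in [0, total)
    selected, upper, k = [], 0, 0
    for u in units:
        upper += n * sizes[u]           # right end of this unit's interval
        # the k-th selection point is start + k * total
        while k < n and start + k * total < upper:
            selected.append(u)
            k += 1
    return selected

# Hypothetical example: 30 Jie Dao with made-up population sizes.
jie_dao_sizes = {f"JD{i}": 1000 + 100 * i for i in range(30)}
sample = randomised_systematic_pps(jie_dao_sizes, 10, random.Random(1))
```

The same function could be applied a second time within each selected Jie Dao to draw two Ju Wei Hui.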

The 10 selected Jie Dao in each city comprise the first stage sample of clusters. The sampling fractions of Jie Dao in the six cities are given in table 1.

Table 1

Sampling fractions (f=n/N) of Jie Dao in the six cities

The next-birthday method was used to select a respondent where there was more than one person in a sampling category to choose from in a household. Two other existing methods for selecting individuals within a household are the Kish method and the last-birthday method. Binson et al7 compared the effectiveness of the three methods using data from a national telephone survey and showed that the next-birthday method had a higher rate of retaining respondents in subsequent waves, although the differences between the last-birthday method and the next-birthday method were not statistically significant. Cooperation rates and response rates for the Wave 1 ITC China Survey data are given in the section on sample data.

Survey measures and questionnaire development

The ITC China Survey, as with each ITC Survey being conducted across the 20 countries of the ITC Project (at the time of this writing), was designed to measure (1) important smoking and smoking-related behaviours; (2) important psychosocial precursors to smoking and to cessation (eg, intention to quit smoking, self-efficacy for quitting, beliefs about smoking and about quitting, perceived risk, societal and subjective norms, attitudes, denormalisation beliefs); and (3) important policy-relevant measures for each of the demand reduction policy domains of the FCTC, including those relevant to health warnings (eg, salience, perceived effectiveness, behaviours relating to reactions to the warnings such as forgoing a cigarette because of the warnings), advertising/promotion (overall salience of pro-tobacco messages and anti-tobacco messages, noticing of tobacco sponsorships), purchasing and price-relevant behaviour, smoke-free laws, cessation and education. The survey also included key psychosocial mediators and (possible) moderators (eg, time perspective, depression) of policy impact.

The development of the ITC China Survey was driven strongly by ITC surveys conducted in other countries, in keeping with the ITC Project's objective of conducting surveys with common measures across the 20 countries. We created the ITC China Survey through a collaborative team effort that involved (1) extensive email exchanges and conference calls between our ITC Project Team centred at the University of Waterloo (and including ITC team members from Roswell Park Cancer Institute), (2) a three-day meeting held at the University of Waterloo with our China National CDC research team, (3) a three-day meeting held two weeks later in Beijing with five ITC team members attending along with the China National CDC research team and the entire research team of 15 CDC officials and researchers from the participating cities in China, and (4) follow-up conference calls and email exchanges to resolve remaining issues. The result was an ITC China Survey in which most of the measures were either identical or, given the linguistic and cultural groups in China, as functionally similar as possible to those included in ITC surveys in other countries. It also included some questions and question options that were unique to China, in accordance with the China team's expertise and experience in tobacco use in China. The ITC China Survey was constructed originally in English and was then translated into Chinese by multiple translators, who discussed and resolved the differences between their versions.

Despite the extensive collaborative process that we used to create the ITC China Survey, including both the identification of important China-specific factors by the China CDC team (from the China National CDC and from each of the local CDC offices) and a multistage collaborative translation process, the ITC China Survey may still fail to measure some important constructs. Nonetheless, we believe that the resulting ITC China Survey represents a reasonable attempt, given the time constraints, to measure key constructs that are relevant in describing, predicting and understanding smoking behaviour and the impact of tobacco control policies among smokers in China.

The main questionnaire for the adult smoker survey includes measures of the demand reduction policies of the FCTC, such as labelling, price/taxation, advertising/promotion, smoke-free, cessation, education, and measures on behaviour and psychosocial characteristics. Most of these measures are common for all ITC surveys but some are specifically designed for the ITC China Survey. For example, the Wave 1 surveys (for both smokers and non-smokers) included a set of questions on the International Quit-and-Win Competition, an ongoing event organised by the Office of Tobacco Control of China CDC. The Wave 2 smoker survey included questions on alcohol consumption, intended to bring statistical evidence to bear on hypothesised psychological and behavioural linkages between drinking and smoking.

The Wave 1 final versions of the smoker and non-smoker surveys were pre-tested in a pilot survey conducted in Wuhan and Shenyang in September and October 2005. The pre-test gave the ITC China team an opportunity to go through the entire process of conducting face-to-face interviews and to identify areas for improvement before the formal launch of the survey in the six cities. One particular issue for the ITC China Survey was how to make effective use of the Ju Wei Hui staff members, who play a pivotal role in making the initial contact with the respondents and helping the interviewers to approach and enter the household for the survey. The pre-test also provided valuable feedback on unclear or even confusing wordings of some of the health knowledge and attitude-related questions, which led to further changes and improvement to the surveys.

Procedure

The ITC China Survey was conducted through face-to-face interviews. Each potential respondent was first given information about the survey and asked to complete the consent form. The average time to complete a survey was 31.4 minutes for smokers and 10.6 minutes for non-smokers, with respective interquartile ranges (IQR) of around 10 minutes and 5 minutes. Interviewers followed a strict protocol in their interview session with each respondent. Up to four visits to a household were made in order to interview the target person(s) within that household.

Survey team

The ITC China team consists of members from the Chinese Center for Disease Control and Prevention (China CDC) and international members from the ITC project. In each city, a project coordinator was appointed at the provincial or city CDC, and the project coordinator subsequently assembled a team consisting of one or two deputy team leaders, one data manager, one quality controller and 20 interviewers. Most of these people were staff members at the local CDC, Jie Dao or Ju Wei Hui, who were associated with the China CDC system. Some of the interviewers in Yinchuan were recruited from students at a local medical school. Team members at the China National CDC as well as international team members oversaw all major steps in the survey execution.

Training

All survey-related materials, including questionnaires, training and quality control manuals, were fully discussed and finalised at a pre-survey workshop. Participants of the workshop included the international team members, members from the China National CDC and representatives from each of the cities. The workshop provided a platform for key team members to reach a common understanding of the ITC China Survey project, to work out details for the training and fieldwork organisation, to foresee potential problems and to suggest possible solutions.

There were two training manuals developed, one for the enumeration process and one for the survey interview. The complete enumeration of all adults living in the 300 randomly selected households within each selected Ju Wei Hui for basic demographic information and smoking status is the first crucial step of the survey. The enumeration data not only served as a basis for the final stage sample selection of individuals but also provided a rich source for the estimation of prevalence for different age-gender groups. This task was carried out by local Ju Wei Hui staff members, with training provided by each city. Training of interviewers was also organised at the city level, with support and supervision from the ITC China team members both at the China National CDC and at the ITC Project Data Management Centre at the University of Waterloo.

Quality control

Several quality control procedures were put in place. One was a three-level checking of finished questionnaires. The ITC China team established an efficient reporting and communication system among the interviewers, the data manager and the quality controller of each city, and the central team members at the National CDC. A standard checklist was created for each of the three levels: the interviewer, the city quality controller and the designated central team member. Another major quality control procedure was the practice of making MP3 recordings for each of the 800 smoker interviews in each of the six cities. These recordings were valuable not only in monitoring the quality of each interviewer's work, but also in alerting the research team to ways of improving the interview script for the survey and in identifying and correcting errors that occurred during the data entry process.

Sample data

Wave 1 of the ITC China Survey was conducted from February to April 2006, and the Wave 2 survey was conducted from October to February 2008. The final sample sizes in each of the six cities varied slightly from the target of 800 smokers and 200 non-smokers. There were consistency and validity checks on all respondents, which excluded several cases from the final datasets. One scenario for exclusion was that a respondent in the smoker survey answered “No” to the screening question “Have you smoked 100 cigarettes or more in your lifetime?” Other scenarios were that a respondent had missing values on gender or birth date, or that key identification variables were mismatched between the Wave 1 and Wave 2 data entries for the same respondent.

Cooperation and response rates at Wave 1

The Wave 1 cooperation and response rates (%) for the six cities are summarised in table 2 for the adult smoker survey. The cooperation rate is calculated as the ratio of the number of completed interviews to the total number of successful contacts, which include both completed interviews and refusals. The response rate is computed as the ratio of the number of completed interviews to the total number of smokers selected in the initial sample. The cooperation rates and response rates presented in table 2 for Shenyang, Shanghai and Yinchuan are exact. The project coordinators in the other three cities unfortunately did not give clear instructions prior to the fieldwork on collecting these data, and the interviewers did not keep records on the number of refusals and the number of unsuccessful contacts. The cooperation rates and response rates for these three cities are estimates only, with the missing numbers recalled by the interviewers and the Ju Wei Hui staff members who accompanied the interviewers through the entire course of the fieldwork.

Table 2

Wave 1 cooperation and response rates

The cooperation rates are comparable to those in the ITC-4 Survey but the response rates are generally higher than the telephone interview response rates in the ITC-4 Survey.
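The two rates defined in this section can be written out directly. The function names are chosen here for illustration, and the counts in the example are invented, since the figures of table 2 are not reproduced in this text.

```python
def cooperation_rate(completed, refusals):
    """Completed interviews divided by successful contacts
    (completed interviews plus refusals)."""
    return completed / (completed + refusals)

def response_rate(completed, selected):
    """Completed interviews divided by all smokers selected in the
    initial sample, whether or not they were ever contacted."""
    return completed / selected

# Hypothetical counts, for illustration only.
completed, refusals, selected = 800, 200, 1600
coop = cooperation_rate(completed, refusals)
resp = response_rate(completed, selected)
```

The response rate can never exceed the cooperation rate, because its denominator also includes selected smokers who could not be contacted.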

Retention and replenishment at Wave 2

The overall retention rates for the combined six cities were 81.6% for smokers and 83.9% for non-smokers. The number of respondents retained, as well as the corresponding retention rates (in parentheses), for each of the six cities, are given in table 3 for smokers and in table 4 for non-smokers. The retention rates for Shenyang and Guangzhou are much lower than for the other four cities, owing to the replacement of an entire Jie Dao or Ju Wei Hui from the Wave 1 sample. The replenishment sample sizes are also included in tables 3 and 4.

Table 3

Wave 2 retention rates and replenishment sample sizes for smokers

Table 4

Wave 2 retention rates and replenishment sample sizes for non-smokers

Statistical methods

Substitution of units

The ITC China Survey employed a stratified multistage cluster sampling design. The primary sampling units, the Jie Dao, and the secondary sampling units, the Ju Wei Hui, were selected using the randomised systematic PPS sampling method, with selection probabilities proportional to the unit population size. The list of 300 households enumerated for each selected Ju Wei Hui was initially conceived as large enough to meet the sampling requirement for not only the first wave baseline survey but also the replenishment samples in subsequent waves. The inclusion probabilities, which are required for weight calculation, can be obtained through a simple rescaling of the Jie Dao or Ju Wei Hui population sizes under the initial PPS sampling design.

The original ITC China Survey sampling design was altered in Guangzhou, where one Ju Wei Hui was replaced by a substitute unit, and also in Shenyang, where one Jie Dao (two Ju Wei Hui) was replaced by another one, because of unforeseeable changes in these two cities. When a multistage cluster sampling design is modified by substitution of units, the inclusion probabilities for the modified design can no longer be computed by the same method based on the initial sampling procedure. For the ITC China Survey, the question can be formulated more specifically as follows: when the original sample units were selected by a randomised systematic PPS sampling method, and some units were later replaced by substitute units, selected from units not included in the original sample by the randomised systematic PPS sampling method, how should the inclusion probabilities for the final sample be computed?

The question is not only of practical interest here for the ITC China Survey Project but also of theoretical interest since substitution of units often occurs in other surveys. Unfortunately, this seemingly simple question does not have a simple answer. Motivated by this particular need from the ITC China Survey, Thompson and Wu8 proposed a simulation-based approach to assessing the effect of substitution of units for the randomised systematic PPS sampling methods. When all design information is available, which is the case for the ITC China Survey, the inclusion probabilities for the final modified design can be approximated through Monte Carlo simulations. Two important observations are especially relevant to the ITC China Survey: (i) when a PPS sampling procedure is modified owing to substitution of units, the resulting inclusion probabilities are no longer proportional to the size measure, even if the substitute units are selected by the same PPS sampling method; (ii) the impact of substitution of units on the final inclusion probabilities depends on the sizes of the units being replaced. If the units being replaced are of average size, the final inclusion probabilities under the modified sampling design are nearly proportional to the unit size. The replaced Ju Wei Hui in Guangzhou and the substituted Jie Dao in Shenyang were both of average size. It was decided that weight calculations for both cities could proceed as if the sampling design was still PPS after the replaced unit was removed from the sampling frame.
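The idea of approximating inclusion probabilities by Monte Carlo simulation can be illustrated with a small sketch. This is not the algorithm of Thompson and Wu8, whose details are not given here; it is a simplified stand-in under assumptions chosen for the example: one fixed unit (`lost_unit`) is unusable whenever it is drawn, its substitute is a single size-proportional draw from units outside the original sample, and all function names and sizes are hypothetical.

```python
import random
from collections import Counter

def systematic_pps(sizes, n, rng):
    """Randomised systematic PPS with integer sizes (n distinct units).
    Assumes n * size <= total for every unit."""
    total = sum(sizes.values())
    units = list(sizes)
    rng.shuffle(units)
    start = rng.randrange(total)
    selected, upper, k = [], 0, 0
    for u in units:
        upper += n * sizes[u]
        while k < n and start + k * total < upper:
            selected.append(u)
            k += 1
    return selected

def simulate_inclusion_probs(sizes, n, lost_unit, reps, rng):
    """Estimate final inclusion probabilities when `lost_unit`,
    if drawn, is replaced by a substitute unit."""
    counts = Counter()
    for _ in range(reps):
        sample = systematic_pps(sizes, n, rng)
        if lost_unit in sample:
            # substitute drawn from units not in the original sample
            pool = [u for u in sizes if u not in sample]
            weights = [sizes[u] for u in pool]
            substitute = rng.choices(pool, weights=weights, k=1)[0]
            sample = [u for u in sample if u != lost_unit] + [substitute]
        counts.update(sample)
    return {u: counts[u] / reps for u in sizes if u != lost_unit}

# Hypothetical frame of 30 clusters; cluster "JD15" is of average size.
sizes = {f"JD{i}": 1000 + 100 * i for i in range(30)}
probs = simulate_inclusion_probs(sizes, n=10, lost_unit="JD15",
                                 reps=2000, rng=random.Random(7))
```

The estimates in `probs` can be compared with n times each unit's size divided by the total size of the frame without the lost unit, which is the "still PPS after removal" approximation described above. Repeating the exercise with a very large or very small `lost_unit` is one way to see how the discrepancy depends on the size of the replaced unit.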

Weight calculation

For Wave 1 data, the weights were simply calculated as the reciprocal of the inclusion probabilities, and were constructed separately for male adult smokers, female adult smokers, and adult non-smokers. While the inclusion probabilities under a multistage sampling design are usually calculated as a product of the sequence of conditional inclusion probabilities from top to bottom, the weights are most conveniently constructed from bottom to top at the four levels of sample selection: individual, household, Ju Wei Hui and Jie Dao. The final Wave 1 weight for a sampled individual was the number of people in the city population and the sampling category represented by that individual.
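The relationship between conditional inclusion probabilities and the final weight can be sketched as follows. The function name and the four probabilities in the example are invented for illustration; the actual ITC China weights involve further details (such as the quota-based selection of individuals) that are not reproduced here.

```python
from math import prod

def design_weight(conditional_probs):
    """Weight = 1 / (product of conditional inclusion probabilities).

    conditional_probs: one probability per level of selection, e.g.
    individual within household, household within Ju Wei Hui,
    Ju Wei Hui within Jie Dao, and Jie Dao within city.
    """
    if not all(0 < p <= 1 for p in conditional_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return 1 / prod(conditional_probs)

# Hypothetical values, listed from bottom to top.
levels = [0.5,    # individual within household
          0.4,    # household within Ju Wei Hui
          0.2,    # Ju Wei Hui within Jie Dao
          0.25]   # Jie Dao within city
weight = design_weight(levels)   # about 100 people represented
```

Because multiplication is commutative, building the weight from the bottom level upwards gives the same result as multiplying the probabilities from the top down; the bottom-up order is a matter of computational convenience.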

For Wave 2 data, two sets of weights were calculated: the Wave 2 longitudinal weights for all successful re-contacts, and the Wave 2 cross-sectional weights for all individuals surveyed at Wave 2, including both the re-contacts and the replenishment sample. The Wave 2 longitudinal weights were based on the Wave 1 weights but were re-scaled at both the household and individual level to adjust for attrition. The Wave 2 cross-sectional weights were constructed by pooling together the re-contacts and the replenishment sample, with the computations reflecting the features of the combined sampling design (cohort and cross-sectional) at Wave 2.
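One common form of attrition adjustment is a ratio rescaling within an adjustment group, shown below as a generic sketch. It should not be read as the exact ITC China procedure, which the text describes only as a re-scaling at the household and individual levels; the function name, the grouping and the weights are hypothetical.

```python
def rescale_for_attrition(wave1_weights, retained_ids):
    """Inflate the Wave 1 weights of retained respondents so that
    they sum to the Wave 1 total of the whole group.

    wave1_weights: dict mapping respondent id -> Wave 1 weight for
                   one adjustment group (e.g. a Ju Wei Hui).
    retained_ids:  ids of respondents re-contacted at Wave 2.
    """
    total = sum(wave1_weights.values())
    kept = sum(wave1_weights[i] for i in retained_ids)
    if kept == 0:
        raise ValueError("no retained respondents in this group")
    factor = total / kept
    return {i: wave1_weights[i] * factor for i in retained_ids}

# Hypothetical group of three Wave 1 respondents, one lost at Wave 2.
w1 = {"r1": 100.0, "r2": 200.0, "r3": 100.0}
w2_longitudinal = rescale_for_attrition(w1, ["r1", "r2"])
```

After rescaling, the retained respondents carry the weight of those lost to follow-up, so the group total is unchanged.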

Acknowledgments

The authors would like to acknowledge the Chinese Center for Disease Control and Prevention and the local CDC representatives in each city for their role in data collection. The authors thank Dr Simon Chapman for constructive comments and suggestions which led to improved presentation of the paper.

Footnotes

  • Funding The ITC China Project was supported by grants from the US National Cancer Institute (R01 CA125116) and the Roswell Park Transdisciplinary Tobacco Use Research Center (P50 CA111236), Canadian Institutes of Health Research (79551), Chinese Center for Disease Control and Prevention, and the Ontario Institute for Cancer Research.

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval Ethics approval was obtained from the Office of Research Ethics at the University of Waterloo (Waterloo, Canada), and the Institutional Review Boards at Roswell Park Cancer Institute (Buffalo, USA), the Cancer Council Victoria (Melbourne, Australia), and the Chinese Center for Disease Control and Prevention (Beijing, China).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • i There was a seventh city in the first two waves—Zhengzhou—but the quality of the data from that city was not sufficiently high; thus, the data from that city are not included in the overall ITC China Survey dataset.