This paper describes the methods of sampling design and data collection of waves 1, 2 and 3 of the International Tobacco Control (ITC) China Survey, with major focus on longitudinal features of the study. Key measures of quality of the survey data, such as retention rates and final sample sizes, are presented. Sample replenishment procedures are outlined, including the addition of a new city, Kunming, at wave 3. Methods for constructing the longitudinal and cross-sectional survey weights are briefly described.
- Public policy
- Secondhand smoke
- Surveillance and monitoring
The WHO Framework Convention on Tobacco Control (FCTC), the first ever international treaty on health adopted under Article 19 of the WHO constitution,1 is now legally binding in 179 ratifying countries, including China which ratified the treaty in 2005.2 Ratifying countries are required to implement nationwide tobacco control policies to meet provisions of the treaty and are encouraged to place even more stringent measures towards limiting and reducing the use of tobacco.
The International Tobacco Control Policy Evaluation Project (the ITC Project) was first created in 2002 in four English-speaking countries: Canada, USA, Australia and UK (the ITC-4 Country Survey). The scientific foundation of the ITC project was laid out in Fong et al.3 The sampling design and data collection methods for the ITC-4 survey project were described in Thompson et al.4 ‘Tobacco Control’ and ‘Policy Evaluation’ are the two main objectives behind the creation of the project and ‘international’ has become the most prominent feature of the project over the past 12 years. The ITC Project has conducted surveys in 22 countries, which cover 60% of the world population and 55% of tobacco users in the world.
China is the largest tobacco producer and consumer in the world, with more than 300 million smokers and more than 700 million non-smokers who are exposed to secondhand smoke. The ITC China Survey is a longitudinal survey of smoking behaviour among adults in China. It was launched in 2006 as one of the most significant expansions of the ITC Project. The broad objective of the project is to evaluate and understand the psychosocial and behavioural effects of national-level tobacco control policies of the FCTC. In addition to the quasi-experimental evaluation of change in policies, the cohort design of the ITC China Survey allows researchers to understand naturally occurring changes in smoking behaviour and their association over time with policies. Wu et al5 contains descriptions of the general methodology for the ITC China Survey.
The first wave of the ITC China Survey was conducted in seven Chinese cities between April and August 2006. One of the cities, Zhengzhou, was later dropped from the study. The second wave of the survey was conducted in the six remaining cities from October 2007 to January 2008. The third wave was conducted from May to October 2009. The third wave also added a new city, Kunming, into the study. Kunming is the capital city of Yunnan province, where the tobacco industry is a significant component of the province's economy. Another important event was the Olympic Games in Beijing in the summer of 2008, which happened between waves 2 and 3 of the ITC China Survey. A number of specific tobacco control policies, most noticeably smoke-free regulations in various public places, were implemented prior to the games in Beijing and other hosting cities. The ITC China Survey provided a unique tool to assess the effectiveness of those policies.
This paper describes the sampling design and data collection procedure of waves 1, 2 and 3 of the ITC China Survey, with major focus on longitudinal features of the study. Key measures of quality of the survey data, such as retention rates and final sample sizes, are presented. Sample replenishment procedures are outlined, including the addition of Kunming at wave 3. Methods for constructing the longitudinal and cross-sectional survey weights are briefly described.
The ITC China Survey design consists of the initial sampling design for wave 1 (see Wu et al5) and designs for replenishment samples at each follow-up wave. The overall sample sizes for each city are targeted at 800 adult smokers and 200 adult non-smokers. At each follow-up wave, respondents from the previous wave are contacted first, and replenishment sample sizes are determined based on the retention rates of the longitudinal samples. Smokers who changed status to ‘quitters’ or non-smokers who became smokers from one wave to the next remain in the longitudinal samples.
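As a rough illustration of the replenishment rule, the sketch below tops a city's retained longitudinal sample back up to the 800/200 targets. The function name and the recontact counts are hypothetical, and the sketch assumes that quitters who stay in the longitudinal sample count towards the smoker target.

```python
# Per-city targets stated in the text.
TARGETS = {"smoker": 800, "non_smoker": 200}


def replenishment_size(target: int, retained: int) -> int:
    """New respondents needed to bring a sample back up to its target."""
    return max(target - retained, 0)


# Hypothetical recontact counts for one city (not survey results).
retained = {"smoker": 640, "non_smoker": 170}
needed = {g: replenishment_size(TARGETS[g], retained[g]) for g in TARGETS}
print(needed)  # {'smoker': 160, 'non_smoker': 30}
```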
The initial wave 1 survey design
The initial wave 1 survey employed a stratified multistage cluster sampling design. Each city is treated as a stratum. Within each city, the first stage clusters are the Street Districts (Jie Dao) and the second stage clusters are the Residential Blocks (Ju Wei Hui). In each of the six cities, 10 Jie Dao were randomly selected using the randomised probability proportional to size sampling method, with the probability of selection proportional to the population size of the Jie Dao. Within each of the selected Jie Dao, two Ju Wei Hui were selected with probability proportional to the population size of the Ju Wei Hui. A simple random sample of 300 households was taken from each selected Ju Wei Hui and a complete enumeration of those households was conducted prior to the selection of individual smokers and non-smokers for the final wave 1 sample. The enumeration process collected basic information on age, gender and smoking status (without rigorous screening) for all members in the listed households.
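The first-stage selection can be sketched as follows. The text does not specify which randomised probability proportional to size variant was used, so this example assumes randomised systematic PPS: units are shuffled, laid end to end by population size, and picked by equally spaced points with a random start. The function name, the Jie Dao labels and the population sizes are all hypothetical.

```python
import random
from itertools import accumulate


def randomised_systematic_pps(sizes: dict[str, int], n: int, rng: random.Random) -> list[str]:
    """Select n units with probability proportional to size.

    Assumes no unit is larger than the sampling interval sum(sizes) / n;
    a larger unit could otherwise be picked more than once.
    """
    names = list(sizes)
    rng.shuffle(names)  # randomise the order before systematic selection
    cum = list(accumulate(sizes[name] for name in names))
    interval = cum[-1] / n
    start = rng.random() * interval  # random start in [0, interval)
    picks = []
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative range contains the point.
        while idx < len(cum) - 1 and cum[idx] <= point:
            idx += 1
        picks.append(names[idx])
    return picks


# Hypothetical frame of 40 Jie Dao with made-up population sizes.
frame = {f"JD{i:02d}": 20000 + 1000 * i for i in range(40)}
selected = randomised_systematic_pps(frame, 10, random.Random(2006))
print(selected)
```

Under this scheme a Jie Dao with population N_i is included with probability 10 × N_i divided by the city total, provided no single Jie Dao exceeds one tenth of the total. The same function could be applied within a selected Jie Dao to choose its two Ju Wei Hui.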
The total wave 1 sample sizes were 800 adult smokers and 200 adult non-smokers for each city, evenly allocated to the 20 selected Ju Wei Hui, with 40 adult smokers and 10 adult non-smokers from each Ju Wei Hui. Individuals from the 300 enumerated households were approached in a random order and 1 adult male smoker, 1 adult female smoker and 1 adult non-smoker from each household were recruited whenever possible until the corresponding category of the sample quota was filled.
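The quota-filling step within one Ju Wei Hui might look like the sketch below. The data layout and function name are hypothetical. The text does not say how a respondent was chosen when a household had several eligible members in the same category, so a random pick is assumed here.

```python
import random


def recruit(households, rng, smoker_quota=40, non_smoker_quota=10):
    """Fill one Ju Wei Hui's quotas from its enumerated households.

    households: list of (household_id, members); each member is a dict with
    keys 'sex' ('M' or 'F'), 'smoker' (bool) and 'adult' (bool).
    Returns two lists of (household_id, member_index) tuples.
    """
    order = list(households)
    rng.shuffle(order)  # households are approached in random order
    smokers, non_smokers = [], []
    for hh_id, members in order:
        if len(smokers) >= smoker_quota and len(non_smokers) >= non_smoker_quota:
            break  # both quotas are filled
        candidates = [(i, m) for i, m in enumerate(members) if m["adult"]]
        rng.shuffle(candidates)  # assumption: random pick among eligible members
        filled = set()  # categories already recruited in this household
        for i, m in candidates:
            # At most one male smoker, one female smoker, one non-smoker.
            category = ("smoker", m["sex"]) if m["smoker"] else ("non_smoker",)
            if category in filled:
                continue
            if m["smoker"] and len(smokers) < smoker_quota:
                smokers.append((hh_id, i))
                filled.add(category)
            elif not m["smoker"] and len(non_smokers) < non_smoker_quota:
                non_smokers.append((hh_id, i))
                filled.add(category)
    return smokers, non_smokers
```

The default quotas of 40 and 10 follow from allocating 800 smokers and 200 non-smokers evenly across the 20 selected Ju Wei Hui in a city.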
Sampling design for waves 2 and 3 replenishment in continuing cities
The 300 enumerated households for each of the selected Ju Wei Hui were intended as a sampling frame not only for the wave 1 sample but also for the replenishment samples for the follow-up waves. There were considerable variations in the response rates at wave 1 and retention rates at the following waves, which implied that the 300 household enumeration lists were exhausted faster in some of the Ju Wei Hui than in others. There were two major factors influencing the waves 2 and 3 replenishment sampling design: (1) the availability of non-sampled units from the existing sampling frame from previous waves and (2) the projected replenishment sample sizes for future waves.
There were sufficient non-sampled units (households) from the initial wave 1 sampling frame to fill the quotas for replenishment samples at wave 2. Prior to wave 3, it was decided that all continuing cities should consider adding a new Jie Dao, selecting one or two Ju Wei Hui from it and building a 300-household enumeration list for each added Ju Wei Hui. The two general rules for the wave 3 replenishment sampling design were (1) to maintain the basic features of the original sampling design used for waves 1 and 2 and (2) to maintain the total overall sample size in each of the seven cities.
For the continuing cities, replenishment samples for wave 3 were taken from either the existing sampling frame or the newly added Ju Wei Hui. If the selection was carried out from the existing sampling frame within a Jie Dao, the following procedures were used:
For each Ju Wei Hui, if there were enough non-sampled respondents from the original enumeration list of 300 households, replenishment respondents were to be taken from that list.
If the 300-household list had been exhausted by the wave 1 and 2 samples or was not sufficient for replenishment, but the Ju Wei Hui had additional households which were not enumerated in the wave 1 and 2 surveys, a new list of households was to be constructed (on top of the original 300 list) and enumerated, and the replenishment sample was to be taken from the new list.
If the Ju Wei Hui had no room for selecting a replenishment sample, the quota of replenishment sample for this Ju Wei Hui was to be fulfilled by the other sampled Ju Wei Hui within the same Jie Dao.
If the two sampled Ju Wei Hui in the Jie Dao did not have sufficient room for the replenishment sample, the quota of the replenishment sample for this Jie Dao would be fulfilled in an adjacent Jie Dao which was included in the initial wave 1 or 2 samples.
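The four fallback rules above amount to an ordered decision. The sketch below encodes them for a single Ju Wei Hui and its partner in the same Jie Dao. The class, field and function names are hypothetical, and "room" is read here as either enough non-sampled people on the original list or at least some households that have never been enumerated.

```python
from dataclasses import dataclass


@dataclass
class JuWeiHui:
    unsampled_on_list: int        # eligible people left on the original 300-household list
    unenumerated_households: int  # households never enumerated at waves 1 and 2


def has_room(jwh: JuWeiHui, needed: int) -> bool:
    """True if the Ju Wei Hui can still supply a replenishment sample."""
    return jwh.unsampled_on_list >= needed or jwh.unenumerated_households > 0


def replenishment_source(jwh: JuWeiHui, partner: JuWeiHui, needed: int) -> str:
    """Apply the wave 3 rules in order and return where the sample comes from."""
    if jwh.unsampled_on_list >= needed:
        return "original list"
    if jwh.unenumerated_households > 0:
        return "new enumeration list"
    if has_room(partner, needed):
        return "other Ju Wei Hui in same Jie Dao"
    return "adjacent Jie Dao"


# Hypothetical case: list exhausted, no further households, partner has room.
print(replenishment_source(JuWeiHui(2, 0), JuWeiHui(30, 0), needed=8))
# other Ju Wei Hui in same Jie Dao
```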
If a new Jie Dao and/or a new Ju Wei Hui needed to be added, the selection of that part of the replenishment sample was conducted by the ITC team at the Chinese Centre for Disease Control and Prevention (China CDC), using the following procedures:
The new Jie Dao was selected with probability proportional to the Jie Dao population size, among those Jie Dao which were not surveyed by waves 1 and 2; two Ju Wei Hui were selected within the new Jie Dao, with probability proportional to Ju Wei Hui population size.
If only one Ju Wei Hui was needed at wave 3 from the new Jie Dao, the Jie Dao was first divided in half in terms of population (depending on the number of Ju Wei Hui in the Jie Dao). One Ju Wei Hui was then selected from a chosen half of the Jie Dao, with probability proportional to Ju Wei Hui population size. The other half of the new Jie Dao could be used for replenishment samples in future waves. If two new Ju Wei Hui were required at wave 3, they were to be chosen with probability proportional to Ju Wei Hui population size from the whole Jie Dao.
In each selected new Ju Wei Hui, a list of 300 randomly selected households was enumerated first, and replenishment samples of smokers and non-smokers were selected from the enumerated households using the method from the wave 1 and 2 sampling design.
Sampling design for wave 3 in Kunming
One of the initial seven cities, Zhengzhou, was dropped from the ITC China Survey after wave 1, partially due to concerns with data quality but more importantly due to the lack of leadership at the city level. Kunming, the capital city of Yunnan Province, emerged as a replacement. Unfortunately, the CDC offices in Yunnan and Kunming were not able to undertake the task. Prior to wave 3, Dr Baifan Zhao of the Yunnan Health Education Institute and her team were enlisted to become part of the ITC China Survey team and to undertake the task of conducting the survey in Kunming.
Since the survey started in Kunming at wave 3, the sampling design followed exactly the same method used for wave 1 in other cities. Ten Jie Dao were selected, and two Ju Wei Hui were chosen from each selected Jie Dao. A list of 300 households was compiled and enumerated for each of the 20 selected Ju Wei Hui. Adult smokers and non-smokers were recruited from the enumerated households.
ITC China Survey data are collected through face-to-face interviews of respondents. The detailed procedures of wave 1 survey were documented in Wu et al.5 The surveys in waves 2 and 3 followed the same procedures and principles as those used in wave 1. The following are some highlights of procedures and measures used by the ITC China Survey at waves 2 and 3, as well as in subsequent waves.
Team building: The ITC central team consists of Dr Yuan Jiang and her staff at the China CDC, and Professor Geoffrey T Fong and several members from the ITC international team. Each city has an ITC local team consisting of a project leader, a fieldwork co-ordinator, a data manager, a quality controller and up to 20 interviewers. The interviewers in Yinchuan and Kunming were recruited from students in local medical schools; the interviewers in other cities were appointed from among the staff members in the local CDC or Ju Wei Hui offices.
Training workshops: There are two levels of training workshops. Each wave starts with a kick-off training workshop attended by all members of the ITC central teams and representatives from each city team. Some ITC international team members also attend the workshop. Each city team then organises the training workshops for interviewers, with training sessions run by members of the central team.
Fieldwork coordination: In waves 2 and 3, in addition to the leadership from the central team and the city team, staff members at the local Jie Dao and Ju Wei Hui offices were used to initiate contact and make appointments with the respondents. This procedure has turned out to be a crucial strategy in making it possible for the interviewers to enter the selected households, because many of the residential buildings have tight security measures and ‘strangers’ are unable to enter the building without a first point of contact with the residents. It is even more crucial for the follow-up interviews, since finding the correct respondent from the previous wave and establishing an initial contact can be extremely hard without the help of those staff members from the local offices.
Incentives: Jie Dao and Ju Wei Hui staff members were paid 5 Renminbi (¥) per respondent for their co-ordination work. The respondents received a gift at the end of the interview, valued at ¥20 for smokers and ¥10 for non-smokers, as a token of thanks for their participation in the survey.
Quality control: The basic structure for quality control is the three-level checking of finished questionnaires, which includes self-checking by the interviewer, further checking by the city quality controller and the final checking by the central team members at the China CDC. The most important procedure, however, is the MP3 recording of all smoker survey interviews. The MP3 recording is useful to verify that a follow-up respondent matches the same individual from the previous wave and is also useful for correcting data errors.
Data entry: ITC China Survey has contracted a professional firm in Beijing for data entry. They use standard procedures such as ‘double entry’ and quality measures, such as ‘random sample checking with error rates less than 5/10 000’.
Further details on the complete list of team members, eligibility criteria, screening and main questionnaires, information and consent letters, training manuals, disposition codes, various forms, etc, can be found in the ITC China Technical Reports (waves 1, 2 and 3).6–8
For longitudinal studies, retention rates at subsequent waves are the most important measure of data quality. For the ITC China Survey, retention rates also dictate the sizes of the replenishment samples. Tables 1–4 present the sizes of the longitudinal samples for adult smokers and non-smokers at waves 1, 2 and 3, for male and female respondents, with retention rates shown in parentheses.
It can be seen that the retention rates in Beijing and Shanghai are very high but the rates are quite low in Shenyang and to a certain degree also in Guangzhou and Yinchuan. One of the difficulties faced in Shenyang prior to wave 3 was a massive restructuring and relocation of residents in several parts of the city, which created obstacles for tracking down wave 2 respondents in several Ju Wei Hui. In Guangzhou, the survey team had issues with access to two Jie Dao at waves 2 and 3 where several residential areas are affiliated with the Chinese Army and tighter security measures had been put in place since the wave 1 survey; this made recontact very difficult and sometimes impossible.
Wave 2 cross-sectional samples consist of recontacts from wave 1 and the replenishment samples at wave 2. Smokers who became quitters in the next wave remained part of the longitudinal sample for smokers, with their smoking status changed from ‘Smoker’ to ‘Quitter’. Table 5 presents the cross-sectional adult smoker sample sizes at waves 2 and 3. Wave 3 cross-sectional adult smoker samples consist of the last three columns in the table, that is, wave 3 (a): recontact smoker from wave 2; wave 3 (b): quitter from wave 2; and wave 3 (c): replenishment sample newly selected at wave 3.
Waves 2 and 3 cross-sectional samples for non-smokers are presented in table 6. The total sample sizes for wave 3 cross-sectional samples consist of the last two columns, that is, wave 3 (i) and (ii). Non-smokers from wave 2 who became smokers at wave 3 were moved to the wave 3 replenishment sample for smokers.
Survey weight calculation
Survey weights are often required for the analysis of survey data, and they are used differently in two types of analysis. For the estimation of descriptive finite population parameters, such as totals and means, the basic design weights (also referred to as expansion or inflation weights) are required. For analytic uses of survey data, where the focus is on exploring relations among variables, suitably rescaled survey weights are more appropriate: the aim of a survey-weighted analysis is then to account for possibly informative features of the sampling design while reducing the variation introduced by the weights.
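The text does not state which rescaling is used. One common choice, shown here purely as an assumption, is to rescale the design weights so that they sum to the number of respondents (for example, within each city). Relative weights between respondents are unchanged, but the weights no longer inflate the apparent sample size. The function name and the example weights are hypothetical.

```python
def rescale_to_sample_size(weights):
    """Rescale weights so that they sum to the number of respondents."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one respondent and a positive weight total")
    return [w * n / total for w in weights]


# Hypothetical design weights for four respondents in one city.
design_weights = [1200.0, 800.0, 2000.0, 4000.0]
analysis_weights = rescale_to_sample_size(design_weights)
print(analysis_weights)  # [0.6, 0.4, 1.0, 2.0]
```

Weighted means are identical under both sets of weights; only estimates of totals require the original design weights.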
The ITC China Survey data from waves 1, 2 and 3 represent a sophisticated scenario for survey weight calculation. First, the basic design weights for the wave 1 survey data are calculated based on the multistage cluster sampling design. Wu et al5 contains a short description of the calculation of the initial survey weights for wave 1. Second, the cross-sectional survey weights at waves 2 and 3 need to consider the modified survey design due to the selection of replenishment samples at each wave. The modifications include added new clusters (Jie Dao or Ju Wei Hui) at wave 2 or 3, and the enlarged enumeration lists of households in some of the old clusters. Third, the longitudinal survey weights can take different forms depending on the types of data used for the analysis.
With three waves and replenishment samples at waves 2 and 3, there are three sets of longitudinal weights that are of interest: (1) waves 1–3 longitudinal weights; (2) waves 1–2 longitudinal weights and (3) waves 2–3 longitudinal weights. For adult smokers, the weights are calculated separately for the male group and the female group. Each set of weights is computed based on the cross-sectional weights at the initial wave and adjusted for attrition. Mathematical details on weight calculation are available in an internal ITC document.9
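The exact adjustment is described only in the internal document cited above, so the following is a generic sketch of one standard approach rather than the ITC method: within each adjustment cell (for example city by sex), the initial-wave cross-sectional weights of retained respondents are inflated so that they again sum to the cell's initial-wave weight total. The function name, field names and cell definition are assumptions.

```python
from collections import defaultdict


def attrition_adjusted_weights(respondents):
    """Return {id: longitudinal weight} for retained respondents.

    respondents: list of dicts with keys 'id', 'cell', 'weight'
    (initial-wave cross-sectional weight) and 'retained' (bool).
    """
    total = defaultdict(float)  # weight total of everyone in the cell
    kept = defaultdict(float)   # weight total of retained respondents
    for r in respondents:
        total[r["cell"]] += r["weight"]
        if r["retained"]:
            kept[r["cell"]] += r["weight"]
    return {
        r["id"]: r["weight"] * total[r["cell"]] / kept[r["cell"]]
        for r in respondents
        if r["retained"]
    }


# Hypothetical cell with one respondent lost to follow-up.
sample = [
    {"id": 1, "cell": "A", "weight": 100.0, "retained": True},
    {"id": 2, "cell": "A", "weight": 200.0, "retained": True},
    {"id": 3, "cell": "A", "weight": 100.0, "retained": False},
]
print(attrition_adjusted_weights(sample))
```

In this example the two retained respondents carry the full initial weight total of 400 between them, each scaled up by a factor of 4/3.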
What this paper adds
Methods for the initial wave 1 International Tobacco Control (ITC) China Survey have already been published in Wu et al.5
Longitudinal design features and survey data characteristics for subsequent waves of the ITC China Survey need to be documented for research work using the ITC China Survey data.
This paper provides critical information on longitudinal features of the sampling design, data collection, data quality and survey weights of the ITC China Survey. It serves as a benchmark for other research papers using the wave 1, 2 or 3 data from the ITC China Survey.
The authors would like to acknowledge the Chinese Centre for Disease Control and Prevention, and the local CDC representatives in each of the six cities and the staff at Yunnan HEI for their role in data collection. They also thank Dr Qiang Li, a research scientist and the ITC China Survey project manager during waves 1–3, for his dedication to the project.
Contributors All contributors are members of the ITC China Survey team who played crucial roles in the design and execution of the survey, and contributed towards the completion of the paper.
Funding The ITC China Project was supported by grants from the US National Cancer Institute (R01 CA125116 and the Roswell Park Transdisciplinary Tobacco Use Research Centre (P50 CA111236)), Canadian Institutes of Health Research (57897, 79551 and 115016) and the Chinese Centre for Disease Control and Prevention. GTF was supported by a Senior Investigator Award from the Ontario Institute for Cancer Research and by a Prevention Scientist Award from the Canadian Cancer Society Research Institute.
Competing interests None.
Patient consent Obtained.
Ethics approval Ethics approval was obtained from the Office of Research Ethics at the University of Waterloo (Waterloo, Canada), and the internal review boards at: Roswell Park Cancer Institute (Buffalo, USA), the Cancer Council Victoria (Melbourne, Australia) and the Chinese Centre for Disease Control and Prevention (Beijing, China).
Provenance and peer review Not commissioned; externally peer reviewed.