Objective The nine pictorial health warning labels (PWLs) proposed by the US Food and Drug Administration vary in the format and features of their visual and textual information. Congruency is the degree to which visual and textual features reflect a common theme. This characteristic can affect attention to and recall of label content. This study investigates the effect of congruency in PWLs on smokers’ attention to and recall of label content.
Methods 120 daily smokers were randomly assigned to view either congruent or incongruent PWLs, while having their eye movements recorded. Participants were asked to recall label content immediately after exposure and 5 days later.
Results Overall, the image was viewed more and recalled better than the text. Smokers in the incongruent condition spent more time focusing on the text than smokers in the congruent condition (p=0.03), but dwell time of the image did not differ. Despite lower dwell time on the text, smokers in the congruent condition were more likely to correctly recall it on day 1 (p=0.02) and the risk message of the PWLs on both day 1 (p=0.01) and day 5 (p=0.006) than smokers in the incongruent condition.
Conclusions This study identifies an important design feature of PWLs and demonstrates objective differences in how smokers process PWLs. Our results suggest that message congruency between visual and textual information is beneficial to recall of label content. Moreover, images captured and held smokers’ attention better than the text.
- health warning labels
The Family Smoking Prevention and Tobacco Control Act of 2009 gave authority to the Food and Drug Administration (FDA) to regulate tobacco products, including the requirement that cigarette packages and advertisements have larger and more visible pictorial health warning labels (PWLs). Required warnings must cover the top 50% of the front and rear panels of packages and must occupy at least 20% of the upper portion of each cigarette advertisement.1 The implementation of PWLs in the USA has been delayed because of a lawsuit by tobacco companies.2 One way to help respond to legal challenges is by demonstrating that the images do not communicate a separate message but only reinforce the text, emphasise its salience and increase the likelihood that the risk message will be remembered.2 However, the nine PWLs proposed by the FDA vary in their degree of congruency between visual and textual information. In congruent PWLs, visual and textual features reflect a common theme (eg, the image portrays a diseased lung, while the text states ‘Cigarettes cause fatal lung disease’). In contrast, incongruent PWLs display an image (eg, a tracheotomy releasing smoke) and text (‘Cigarettes are addictive’) that do not convey the same message. Although the messages communicated by the image and the text in incongruent PWLs are not incorrect, their processing might require more inferential steps and thereby impose greater cognitive load.3 The degree of incongruence might therefore affect the processing of visual and textual information as well as the memory of the overall risk message and undermine rather than reinforce their effect.
Previous research has shown that PWLs on cigarette packages are more effective than text-only warnings with regard to attention to warnings,4 cognitive and emotional reactions to warnings, changes in beliefs about smoking, intentions and attempts to quit smoking and forgoing a cigarette.5 6 Evidence of whether the recall of PWLs is superior to that of text-only warning labels has been mixed.7–9 In a study comparing PWLs and text-only warning labels, the inclusion of the images did not generate superior aided recall.7 In a study by Nonnemaker et al,8 only one of the nine FDA-proposed PWLs was recalled better when the image was included. In contrast, a study of PWLs in cigarette advertisements on recall of label content found a significant difference in correct recall of label content between text-only and PWL conditions (50% vs 83%).9
In the presence of both image and text, textual information that conveys the same message as the image might enhance recall of information.10 Previous research has shown that ads were recalled better when their semantic content was congruent with the editorial material.11–14 However, although one study found that congruency had no effect on attention to the ads,13 another study11 found that incongruent ads were viewed longer than congruent ads. Semantic priming,15 which has been shown to influence processing fluency,16 has been proposed as a theoretical explanation for memory effects. Here, editorial context acts as a prime and activates viewers’ related knowledge in memory.17 As a result, information consistent with the activated network is automatically processed and becomes more accessible to the viewer’s memory,18 is processed more easily and, consequently, recalled better. Although attention is directed to incongruent ads, these would be poorly recalled because of their incompatibility with the activated network. Congruency of visual and textual information in PWLs might also benefit from semantic priming and processing fluency and, therefore, improve recall of label content.19 20
The aim of the current study was to examine whether a congruency–incongruency design feature affects attention and improves immediate and longer-term recall of label content. Based on previous advertising literature,11–14 we hypothesised that label content of congruent PWLs would be recalled better than that of incongruent PWLs immediately and 5 days later. Eye tracking was used to better understand how attention is divided between image and text. We had no a priori hypotheses on whether attention would differ by level of congruency, but previous work in this area has shown the important association between attention and recall.4 9 21 22
Sample and procedure
Participants responding to newspaper advertisements and to postings on the advertisement website craigslist were phone screened for eligibility. Inclusion criteria were as follows: 21–65 years old; speaking English fluently; no visual impairments; smoking at least five cigarettes (commercially made) daily for at least 1 year; not currently trying to quit, not intending to quit (next 3 months) and not using smoking cessation products; not pregnant or breastfeeding; drinking less than 25 alcohol-containing drinks per week; no current depression, antipsychotic medications or daily opiate-containing medications; no substance use disorders in the last year or previous diagnosis for schizophrenia disorder; and carbon monoxide (CO) >5 ppm at intake session (baseline CO breath sample to biochemically verify smoking status). Two of the participants fell under this CO threshold and were excluded from the study. Data from six participants (5%) were excluded from the analyses because of low-quality eye movement data (calibration difficulties or technical problems with the eye-tracking equipment) resulting in a final sample of 112 daily cigarette smokers.
The study consisted of two 1-hour sessions separated by 5 days. In the initial session (time 1), participants gave informed consent and provided a breath alcohol sample. Participants then smoked one of their own cigarettes to standardise the time since their last cigarette.21 23 CO breath samples were assessed before and after smoking. Participants completed baseline questionnaires and were randomly assigned to one of the two PWL conditions (congruent or incongruent). They were seated in front of the eye-tracking device that was calibrated, and they viewed each of four PWLs for 20 s,9 21 separated by 15 s of a screen with a fixation cross at its centre. After viewing, participants were asked to recall the PWLs. Five days later (time 2), participants once again had their breath alcohol and CO assessed, followed by smoking one of their own cigarettes and postsmoking CO measurement. Participants again completed recall measures. Finally, participants were debriefed and received $100 compensation. This protocol was approved by the University of Pennsylvania Institutional Review Board.
Figure 1 presents the FDA-proposed PWLs that were used in the congruent and incongruent conditions. The PWLs were evaluated by three independent trained raters to characterise congruency between image and text. Raters were naïve to the study purpose at the time of coding. Possible responses about image and text were ‘same message’, ‘different message’, ‘somewhat similar message’ and ‘not certain’. Intraclass correlation coefficient (ICC) between raters was excellent (ICC (7, 14)=0.85, 95% CI 0.53 to 0.97, p<0.001).
Demographic and smoking history measures
Age, gender, race, ethnicity, educational background, years of smoking, daily cigarette consumption, age of first cigarette, craving,24 readiness to consider smoking cessation25 and nicotine dependence26 were assessed.
Gazetracker software (V.07.01.243.128, Eye Response Technologies, Charlottesville, Virginia) was used to display the PWLs. Eye movements were measured using an Eye-Trac 6 control unit with an R6 pan/tilt optics system and a video head tracker (Applied Science Laboratories, Boston, Massachusetts) and the Eye-Trac 6 User Interface Program (V.188.8.131.52). Areas of interest (AOIs) were identified a priori for each PWL and consisted of the image (AOI image) and the text warning (AOI text). For each AOI, the dwell time (total time viewed in AOI; in seconds), the latency (time to first viewing of AOI; in seconds) and the latency duration (duration of initial fixation; in seconds) were assessed.9 21 Fixations were operationalised as any 60-pixel-diameter space with three consecutively sampled observations with a minimum 200 ms duration.27
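The three attention measures defined above can be illustrated with a minimal sketch. It assumes that fixations have already been detected and exported as a list with onset time, duration and screen coordinates, and that each AOI is a rectangle; the `Fixation` and `AOI` classes and the `aoi_measures` function are hypothetical names introduced for illustration and are not part of the Gazetracker or Eye-Trac software.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fixation:
    start: float     # seconds since label onset (assumed export format)
    duration: float  # seconds
    x: float         # screen coordinates in pixels
    y: float


@dataclass
class AOI:
    # Rectangular area of interest; a simplifying assumption for this sketch.
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation: Fixation) -> bool:
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def aoi_measures(
    fixations: List[Fixation], aoi: AOI
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (dwell time, latency, latency duration) for one AOI.

    dwell time       = summed duration of all fixations inside the AOI
    latency          = onset of the first fixation inside the AOI
    latency duration = duration of that first fixation
    Latency values are None when the AOI was never fixated.
    """
    inside = [f for f in sorted(fixations, key=lambda f: f.start)
              if aoi.contains(f)]
    if not inside:
        return 0.0, None, None
    dwell = sum(f.duration for f in inside)
    first = inside[0]
    return dwell, first.start, first.duration
```

Returning `None` for an AOI that was never fixated keeps missing latencies out of group means, which would be one way to account for the smaller degrees of freedom reported for the latency comparisons.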
Participants were asked to recall the image (eg, ‘Describe the picture in the first warning label’), the text (eg, ‘Based on the first warning label, what did the text read?’) and the risk message (eg, ‘In your own words, what is the main health or risk message of the first warning label?’) of the viewed PWLs.8 Three trained coders unaware of the study hypothesis scored each statement according to pre-established guidelines as correct or incorrect. In the incongruent condition, the risk message was scored as correct if the participant recalled the message that either the image or the text conveyed. This recall measure and its scoring are derived from standard open-ended recall procedures.9 28 ICC values for the recall of the image, the text and the risk message at times 1 and 2 ranged from a low of 0.60 (95% CI 0.27 to 0.65) for the PWL ‘incongruent 3’ at time 2 to a high of 0.88 (95% CI 0.85 to 0.91) for the PWL ‘congruent 1’ at time 1; the overall mean ICC was 0.79 (95% CI 0.65 to 0.92). All ps<0.01 and all ICCs exceeded the minimum for good reliability.23 29
Data were analysed in SPSS V.24. Descriptive statistics were used to characterise the overall sample and each PWL group. Independent-samples t-tests and χ2 tests were conducted to identify potential group differences. Independent-samples t-tests were performed to examine differences between conditions in attention measures and in the number of correctly recalled PWLs at times 1 and 2. In those analyses, the mean of the correctly recalled content averaged across the four different PWLs was used as outcome measure (range 0–4). Repeated-measures ANOVAs were conducted to predict change in correct recall over time using condition as between-subject factor and time as within-subject factor. Linear regression analyses were conducted to determine models of predicting correct recall of the image, the text and the risk message (collapsed over the four PWLs) at times 1 and 2.
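As an illustration of the between-condition comparison, the following sketch computes a per-participant recall score (0–4) and a pooled-variance independent-samples t statistic. It assumes the equal-variances form of the test, which is consistent with the reported degrees of freedom but is not stated explicitly; the function names and the example data are hypothetical, and the analyses themselves were run in SPSS.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def recall_score(correct_flags: Sequence[bool]) -> int:
    """Number of the four PWLs whose content was recalled correctly (0-4)."""
    return sum(1 for flag in correct_flags if flag)


def pooled_t(group_a: Sequence[float],
             group_b: Sequence[float]) -> Tuple[float, int]:
    """Student's independent-samples t statistic and degrees of freedom.

    Assumes equal variances in both groups (pooled-variance form).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical recall scores, for illustration only (not study data).
congruent = [recall_score(flags) for flags in [
    (True, True, True, False),
    (True, True, True, True),
    (True, False, True, False),
    (True, True, False, True),
]]
incongruent = [1, 2, 1, 2]
t_value, df = pooled_t(congruent, incongruent)
```

With the study's final sample of 112 participants in two groups, the same formula gives 110 degrees of freedom.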
Descriptive statistics and randomisation verification
Table 1 displays sample characteristics by condition. Participants were on average 32.6 years old (SD=9.90; range=21–61). The sample was predominantly male (61.6%) and mostly Caucasian (49.1%) or African American (43.8%). Participants reported smoking for an average of 15.49 years (SD=9.65) and 15.69 cigarettes/day (SD=8.42) with an average nicotine dependence score of 4.33 (SD=2.24). Independent-samples t-tests and χ2 tests showed no differences between conditions in descriptive and smoking characteristics.
Differences between conditions in attention
Attention data showed that smokers focused longer on the AOI image (M=12.93 s, SD=2.28) than on the AOI text (M=4.58 s, SD=1.68; t(111)=24.43, p<0.001). Latency to the AOI image (M=0.18 s, SD=0.38) was shorter than latency to the AOI text (M=1.34 s, SD=1.29; t(101)=−8.96, p<0.001). Latency duration was longer for the AOI image (M=1.22 s, SD=1.22) than for the AOI text (M=0.88 s, SD=0.64; t(101)=2.45, p=0.02).
Smokers who viewed the incongruent PWLs spent more time focusing on the AOI text than smokers who viewed the congruent PWLs (M=4.91 s, SD=1.80 vs M=4.21 s, SD=1.46; t(110)=−2.22; p=0.03) (see figure 2). The results also showed an effect of condition on latency to the AOI image. Smokers who viewed the congruent PWLs focused faster on the AOI image than smokers who viewed the incongruent PWLs (M=0.09 s, SD=0.36 vs M=0.24 s, SD=0.39; t(100)=−2.04; p=0.04). No effects of condition on latency duration of the AOIs image or text were found.
Differences between conditions in recall at times 1 and 2
Table 2 displays correct recall of the image, the text and the message at times 1 and 2 for each PWL. Overall, correct recall of the image and the message at times 1 and 2 was relatively high; however, smokers had relatively greater difficulty recalling the text than the image. Repeated-measures ANOVAs were conducted to examine change in correct recall over time using condition as between-subject factor and time as within-subject factor. There was no significant effect of time (Wilks’ Lambda=0.98, F(1, 110)=2.16, p=0.14) or condition (F(1, 110)=0.76, p=0.39) on the recall of the image. Correct recall of the text decreased in both conditions (time: Wilks’ Lambda=0.84, F(1, 110)=20.89, p<0.001), and recall was better in the congruent condition (F(1, 110)=5.72, p=0.02). There was no change in recall of the message (time: Wilks’ Lambda=0.98, F(1, 110)=2.47, p=0.12), but the message was better recalled in the congruent condition (F(1, 110)=11.94, p=0.001).
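For readers who wish to check the p values that accompany the F statistics above, the following sketch recovers a p value from an F statistic with one numerator degree of freedom. It relies on the identity that F(1, df) equals the square of a t statistic with df degrees of freedom and integrates the t density numerically with Simpson's rule; the function names are hypothetical, and this is not the procedure SPSS uses internally.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def two_sided_p_from_t(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p value, using Simpson's rule on the interval [0, |t|].

    steps must be even for Simpson's rule.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


def p_from_f_1df(f_value: float, df_error: int) -> float:
    """p value for F(1, df_error), via the identity F(1, df) = t(df)**2."""
    return two_sided_p_from_t(sqrt(f_value), df_error)


# Effects of time and condition on image recall, as reported above.
p_time_image = p_from_f_1df(2.16, 110)
p_condition_image = p_from_f_1df(0.76, 110)
```

Because the identity holds only for one numerator degree of freedom, this shortcut applies to the two-level factors used here (condition and time) but not to factors with more levels.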
Independent-samples t-tests were performed to examine differences between conditions in the number of images, texts and messages that were recalled correctly at times 1 and 2 (see figure 3). At time 1, smokers in the congruent condition were more likely to correctly recall the text (M=1.69, SD=1.11 vs M=1.19, SD=1.00; t(110)=2.48; p=0.02) and the message (M=2.94, SD=0.96 vs M=2.29, SD=1.17; t(110)=3.21; p=0.002) than smokers in the incongruent condition. No significant difference between conditions was found regarding the number of images that were recalled correctly (p=0.17). Five days later, smokers in the congruent condition were more likely to correctly recall the message than smokers in the incongruent condition (M=2.78, SD=1.24 vs M=2.10, SD=1.31; t(110)=2.80; p=0.006). No significant differences between conditions were found regarding the number of images (p=0.77) and texts (p=0.11) that were recalled correctly.
Predicting recall at times 1 and 2
Regression models were performed to investigate the association between attention and correct recall at time 1 and attention and correct recall at time 2. Attention measures (dwell time of the AOIs image and text, latency to the AOIs image and text and latency duration of the AOIs image and text) were unrelated to correct recall of the image, the text and the message at times 1 and 2. In three independent linear regression models, correct recall of the image (b=0.57, t(110)=4.18, p<0.001), the text (b=0.48, t(110)=6.34, p<0.001) and the message (b=0.62, t(110)=6.50, p<0.001) of the PWLs at time 1 significantly predicted correct recall at time 2, respectively.
The aim of this study was to examine the effect of message congruency in PWLs on attention and on immediate and longer-term recall of label content. Results indicate that although smokers in the congruent condition focused less on the text, they were more likely than smokers in the incongruent condition to correctly recall it on day 1 and to recall the message of the PWLs on both days 1 and 5. The observed positive congruency effect on memory supports the results observed in advertising research.12–14 Congruency may have evoked semantic priming effects that increased processing fluency and improved memory for the information. That congruency affects attention and memory differently is also in accordance with previous research. Incongruent ads attracted more attention and generated more effortful processing than congruent ads.3 30 31 For congruent PWLs, the category might already be activated by the image. As a result, processing the text requires less attention for the overall message to be comprehended.11 Longer dwell time on the text in incongruent PWLs might also be due to alternating attention between the image and the text in order to produce a complete mental model of the overall message.32
Independent of condition, smokers focused faster and longer on the image than on the text, suggesting that images in PWLs capture and hold smokers’ attention. However, smokers moved their attention relatively quickly (after approximately 1 s) to the text. This could be because smokers were exposed to the PWL only and not to a cigarette pack or advertisement that included the PWL. Previous research has shown that smokers’ latency to text is relatively long when included in a cigarette advertisement,21 exceeding the average total viewing time of a print ad.33 It remains unclear why smokers in the congruent condition focused faster on the image than smokers in the incongruent condition. Because smokers in both conditions focused faster on the image than on the text, this result should be treated with caution.
Overall, smokers recalled the image and the message relatively well while having difficulties recalling the text. There was no change in recall of the image and the message over time, but correct recall of the text decreased from time 1 to time 2. Together, this pattern of results emphasises the importance of images in PWLs and supports the idea that the overall message is enhanced by the image.
It is surprising that the attention–recall association that has been demonstrated in previous PWL research could not be replicated. This study differs from previous work, for example, in the format of exposure (to PWLs only compared with PWLs on packs or advertisements), which might be a possible explanation for the lack of findings. This effect could therefore differ if PWLs were presented in the context of cigarette branding content. However, our results suggest that proximal recall is strongly associated with distal recall (5 days later). Therefore, using features and formats that maximise initial attention and recall may help long-term effectiveness.
Some limitations should be acknowledged. First, we did not assess the effects of congruent visual and textual information on smoking/quitting behaviour. However, because a goal of PWLs is to inform smokers about the health risks associated with tobacco,34 35 examining the effects on memory performance is important. Second, we used the existing FDA-proposed PWLs to investigate congruency effects. This increased external validity, but as a consequence, the sets of PWLs differed in other ways than in their level of congruency (eg, location, structure or length of the text, framing of the message36 and image type37). These differences are possible confounding variables whose influence on both congruency and recall could account for their association. Future research should experimentally manipulate these factors and investigate whether they enhance recall. Third, we combined recall across congruent/incongruent PWLs. To rule out that a single PWL is driving the congruency effect, further research needs to investigate the effects of congruency on individual PWLs and over an extended period of time.
To develop regulations, to effectively design new PWLs or to improve the effectiveness of the nine FDA-proposed PWLs, legislators and policymakers should consider two aspects. First, our results suggest that the text in PWLs should be accompanied by images, as they captured and held smokers’ attention and were recalled better than the text. Previous research has also emphasised the superiority of visual as opposed to textual information.6 9 32 38–40 Moreover, smokers largely ignore the text21 41–45 and have difficulties recalling it when placed in cigarette advertisements.9 41 As images drive information processing, they should clearly and explicitly portray the message. Second, our results also indicate that visual and textual information in PWLs should be congruent. In incongruent PWLs, viewers face the task of integrating two messages in the comprehension process. It is unclear what message some of the images in incongruent PWLs communicate (eg, PWL ‘Incongruent 3’). Therefore, the correct comprehension of the image/message depends on and requires an understanding of the text. Because smokers spend little time on advertisements and PWLs,32 41 the initial impression of the PWL is valuable. Message congruency would increase the likelihood of smokers remembering the overall message.
In the USA, recent court rulings have prevented the adoption of pictorial warnings partly because the images lacked a factual basis for inclusion. Results from this study show that images capture and hold attention better than text. Also, factually true text information delivered with supportive or congruent visual information is better recalled than versions where text and image are incongruent. Congruent PWLs simply provide a means to deliver the same information in image and text format and might therefore be more useful in resisting legal challenges in the USA. Moreover, if replicated, these results might be important to other countries that have already developed and implemented PWLs and have reached the point of generating new content or updating the content of existing PWLs.
What this paper adds
Images in pictorial health warning labels capture and hold attention, as objectively measured with eye tracking.
Results suggest that images capture and hold smokers’ attention better than text.
Congruency, the degree to which visual and textual features reflect a common theme, is beneficial to memory of label content.
Purely factual text information supported by congruent images may be an important policy approach.
Contributors AAS conceived and supervised the study. Data collection was conducted by KZT. KL completed the analyses and led the writing with major contributions from all authors. All authors have approved the final article.
Funding Research reported in this publication was supported by the National Cancer Institute (NCI) of the National Institutes of Health (NIH) and FDA Center for Tobacco Products (CTP) under Award Numbers P50CA179546, R01CA180929, and P20CA095856. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the Food and Drug Administration (FDA).
Competing interests None declared.
Ethics approval University of Pennsylvania Institutional Review Board.
Provenance and peer review Not commissioned; externally peer reviewed.