Working with Large Language Models to Enhance Messaging Effectiveness for Vaccine Confidence

Lucinda Gullison, Feng Fu

TL;DR

The paper investigates whether ChatGPT-augmented vaccine messaging can be more persuasive than the original messaging in countering hesitancy, motivated by the resource constraints that small public health departments face. It uses a blind A/B survey ($n \approx 138$ valid responses) that compares original and ChatGPT-augmented versions of six messages. An ANOVA shows significant differences among messages ($p = 6.68 \times 10^{-10}$) but no significant overall ChatGPT effect ($p = 0.4642$). ChatGPT-augmented messages were rated slightly higher on average, particularly the longer ones, though responses show a strong primacy effect: the message listed first in a pair tended to receive the higher rating. The study supports the potential of human-AI collaboration in public health messaging and identifies directions for future work, including larger samples, randomized order controls, demographically tailored messages, and tests across additional LLMs and prompts, to improve practical impact in vaccination campaigns.

Abstract

Vaccine hesitancy and misinformation are significant barriers to achieving widespread vaccination coverage. Smaller public health departments may lack the expertise or resources to craft effective vaccine messaging. This paper explores the potential of ChatGPT-augmented messaging to promote confidence in vaccination uptake. We conducted a survey in which participants chose between pairs of vaccination messages and assessed which was more persuasive and to what extent. In each pair, one message was the original, and the other was augmented by ChatGPT. At the end of the survey, participants were informed that half of the messages had been generated by ChatGPT. They were then asked to provide both quantitative and qualitative responses regarding how knowledge of a message's ChatGPT origin affected their impressions. Overall, ChatGPT-augmented messages were rated slightly higher than the original messages. These messages generally scored better when they were longer. Respondents did not express major concerns about ChatGPT-generated content, nor was there a significant relationship between participants' views on ChatGPT and their message ratings. Notably, there was a correlation between whether a message appeared first or second in a pair and its score. These results point to the potential of ChatGPT to enhance vaccine messaging, suggesting a promising direction for future research on human-AI collaboration in public health communication.

Paper Structure

This paper contains 10 sections and 4 figures.

Figures (4)

  • Figure 1: Overall, subjects' ratings favored ChatGPT-augmented messages. Four of the six ChatGPT-augmented messages were consistently rated higher. Subjects' responses were typically bimodal, with a slightly positive overall average, suggesting that subjects generally perceive ChatGPT-augmented messages more favorably.
  • Figure 2: Subjects' positive views on ChatGPT have little impact on their message ratings. The histogram in panel (a) shows that subjects generally hold positive views of ChatGPT, with a few exceptions ($n = 138$, mean 1.557971, one-sample t-test p-value $< 2.2 \times 10^{-16}$). Panel (b) shows a scatter plot of subjects' views of ChatGPT against their message ratings, along with a correlational analysis ($n = 138$, correlation coefficient $r = 0.06366351$, p-value 0.458192), which suggests that subjects' positive views of ChatGPT do not lead them to consistently rate ChatGPT-augmented messages higher.
  • Figure 3: Longer ChatGPT-augmented messages are more likely to be rated higher than shorter ones. Panel (a) plots message length against subjects' evaluation scores for all six ChatGPT-augmented messages ($n = 138$, correlation coefficient $r = 0.00807$, intercept $= -0.17740$, p-value 0.7661). Panel (b) focuses on the four ChatGPT-augmented messages that were placed first ($n = 138$, correlation coefficient $r = 0.01145$, intercept $= 0.36207$, p-value 0.5332).
  • Figure 4: Sequential effect of message placement order. Although ChatGPT-augmented messages were placed randomly, either before or after the original, in our A/B testing, subjects rated the messages placed first significantly higher than those placed second (mean first: 0.788603, second: -1.209558). The sample size was $n = 138$. A Welch two-sample t-test, with the alternative hypothesis that the true difference in means is not equal to zero, yielded a p-value of 0.005718, which is statistically significant at an alpha level of 0.05.
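
The Figure 4 caption reports a Welch two-sample t-test on ratings of messages placed first versus second. As a rough illustration of what that test computes, the sketch below derives the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The function name `welch_t`, the two rating lists, and the rating scale are hypothetical and are not taken from the study; the authors' own analysis code is not shown in the source.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Welch's test does not assume the two groups share a variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Squared standard error of each mean (sample variance / sample size).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical ratings for illustration only; NOT the study's data.
placed_first = [2, 1, 3, -1, 2, 0, 1, 2]
placed_second = [-2, -1, 0, -3, 1, -2, -1, 0]

t_stat, df = welch_t(placed_first, placed_second)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Turning the statistic into a two-sided p-value requires the Student t distribution, which the standard library does not provide. A statistics package would normally handle that step, for example `scipy.stats.ttest_ind` with `equal_var=False`.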