PHORECAST: Enabling AI Understanding of Public Health Outreach Across Populations
Rifaa Qadri, Anh Nhat Nhu, Swati Ramnath, Laura Yu Zheng, Raj Bhansali, Sylvette La Touche-Howard, Tracy Marie Zeeger, Tom Goldstein, Ming Lin
TL;DR
PHORECAST presents a large multimodal dataset that ties public health campaign media to rich participant profiles (demographics, personality, locus of control) to enable fine-grained prediction of both individual and community responses to health messaging. Using LoRA-based fine-tuning and feature randomization, the study benchmarks vision-language models on predicting opinion indicators and free-form responses, revealing substantial improvements over baselines and highlighting the value of demographic and psychographic conditioning. The work includes thorough ablations showing that education, personality facets, locus of control (LOC), and in-context Q/A cues differentially drive predictive accuracy, and it demonstrates the important role of visual stimuli in shaping precise judgments. While demonstrating promising gains and a path toward socially aware, personalized public health AI, the paper also discusses limitations (e.g., English-speaking, United States–centric samples) and points to future work on temporal dynamics and cross-cultural generalization to broaden applicability.
Abstract
Understanding how diverse individuals and communities respond to persuasive messaging holds significant potential for advancing personalized and socially aware machine learning. While large vision-language models (VLMs) offer promise, their ability to emulate nuanced, heterogeneous human responses, particularly in high-stakes domains like public health, remains underexplored, due in part to the lack of comprehensive multimodal datasets. We introduce PHORECAST (Public Health Outreach REceptivity and CAmpaign Signal Tracking), a multimodal dataset curated to enable fine-grained prediction of both individual-level behavioral responses and community-wide engagement patterns to health messaging. This dataset supports tasks in multimodal understanding, response prediction, personalization, and social forecasting, allowing rigorous evaluation of how well modern AI systems can emulate, interpret, and anticipate heterogeneous public sentiment and behavior. By providing a new dataset to enable AI advances for public health, PHORECAST aims to catalyze the development of models that are not only more socially aware but also aligned with the goals of adaptive and inclusive health communication.
