LocalTweets to LocalHealth: A Mental Health Surveillance Framework Based on Twitter Data

Vijeta Deshpande, Minhwa Lee, Zonghai Yao, Zihao Zhang, Jason Brian Gibbons, Hong Yu

TL;DR

This work introduces LocalTweets, a neighborhood-level mental health (MH) surveillance dataset built from tweets posted in 765 US census block groups (2015–2019) and paired with CDC MH outcomes, and LocalHealth, a language-model-based predictor that ingests locally posted tweets to forecast MH prevalence. The authors systematically compare keyword-filtered and unfiltered (general) tweets and evaluate multiple encoders, showing that general tweets often generalize best, while domain-adapted and larger models (e.g., GPT-3.5) yield strong zero-shot performance. They demonstrate that incorporating the Area Deprivation Index (ADI) improves prediction and that LocalHealth can extrapolate CDC estimates to unreported neighborhoods with competitive accuracy (e.g., an F1-score of about 0.73). The framework lays groundwork for real-time, neighborhood-focused MH surveillance and resource allocation guidance, while emphasizing privacy, ethical use, and reproducibility. Key contributions include the LocalTweets benchmark, the LocalHealth prediction pipeline, and a detailed analysis of data availability and spatial extrapolation in population-level MH forecasting. Finally, the work offers practical implications for public health policy and suggests future expansions to other health domains and equitable resource distribution.

Abstract

Prior research on Twitter (now X) data has provided positive evidence of its utility in developing supplementary health surveillance systems. In this study, we present a new framework to surveil public health, focusing on mental health (MH) outcomes. We hypothesize that locally posted tweets are indicative of local MH outcomes and collect tweets posted from 765 neighborhoods (census block groups) in the USA. We pair the tweets from each neighborhood with the corresponding MH outcome reported by the Centers for Disease Control and Prevention (CDC) to create a benchmark dataset, LocalTweets. With LocalTweets, we present the first population-level evaluation task for Twitter-based MH surveillance systems. We then develop an efficient and effective method, LocalHealth, for predicting MH outcomes based on LocalTweets. When used with GPT-3.5, LocalHealth achieves the highest F1-score and accuracy of 0.7429 and 79.78%, respectively, a 59% improvement in F1-score over GPT-3.5 in the zero-shot setting. We also utilize LocalHealth to extrapolate the CDC's estimates to proxy unreported neighborhoods, achieving an F1-score of 0.7291. Our work suggests that Twitter data can be effectively leveraged to simulate neighborhood-level MH outcomes.
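As an illustration of the pairing step described in the abstract, the following is a minimal sketch of joining tweets to a reported outcome per block group and year. The field names (`bg_id`, `year`, `text`), the toy values, and the choice to drop block groups without a reported outcome are assumptions made for this example; they are not taken from the LocalTweets release.

```python
from collections import defaultdict

# Hypothetical toy records; identifiers and outcome values are made up.
tweets = [
    {"bg_id": "BG-A", "year": 2018, "text": "long day at work"},
    {"bg_id": "BG-A", "year": 2018, "text": "feeling anxious lately"},
    {"bg_id": "BG-B", "year": 2018, "text": "great game tonight"},
    {"bg_id": "BG-C", "year": 2018, "text": "no outcome reported here"},
]
outcomes = {("BG-A", 2018): 14.2, ("BG-B", 2018): 11.5}


def pair_tweets_with_outcomes(tweets, outcomes):
    """Group tweets by (block group, year) and attach the reported outcome.

    Block groups with no reported outcome are dropped (an assumption of
    this sketch, not a documented rule of the dataset).
    """
    grouped = defaultdict(list)
    for tweet in tweets:
        grouped[(tweet["bg_id"], tweet["year"])].append(tweet["text"])
    return [
        {"bg_id": bg, "year": yr, "tweets": texts, "outcome": outcomes[(bg, yr)]}
        for (bg, yr), texts in sorted(grouped.items())
        if (bg, yr) in outcomes
    ]


examples = pair_tweets_with_outcomes(tweets, outcomes)
```

Each resulting record holds all tweets from one block group in one year together with a single target value, which is the unit a population-level predictor would consume.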

Paper Structure (27 sections, 4 equations, 10 figures, 5 tables, 1 algorithm)

Figures (10)

  • Figure 1: Data Collection Process. In this figure, we present a simple schematic of our data curation process. First, we sample 1K neighborhoods (i.e., block groups or BGs) and curate a list of keywords for three categories of tweets to form queries. Second, we query the CDC and Twitter databases to collect the desired data. Finally, for each BG, we join the set of tweets posted from the BG with the reported health outcome from the CDC database. The final cleaned version of LocalTweets includes 765 unique BGs, spans five years, and includes over 22 million tweets.
  • Figure 2: Distributional Properties of LocalTweets. (A) Region vs. Number of Tweets: MH tweets are the most numerous, while FI and general tweets have comparable volumes. Tweet volume is slightly skewed toward the South and West regions. (B) ADI vs. Number of Tweets: MH and FI tweets are slightly skewed toward ADIs $\geq 70$ and $\leq 20$. (C) Region vs. Number of BGs: The distribution of data splits is approximately the same across regions. The Northeast region has fewer BGs than the other regions. Refer to the appendix on regional distribution for more discussion. (D) ADI vs. Number of BGs: The number of BGs is fairly balanced over the ADI values and across the data splits.
  • Figure 3: Effect of Data Availability on Prediction of Future Outcomes. In the figure, we present the effect of data availability (x-axis) on the prediction of future (2019) MH outcomes for all 765 BGs in LocalTweets. We evaluate models on the correct identification of the BG risk category and plot the F1-score on the y-axis. The lines and shaded regions represent the average value and range of F1-scores, calculated over 10 seeds.
  • Figure 4: Effect of Data Availability on Prediction of Outcomes for Unreported Neighborhoods. In the figure, we present the effect of data availability (x-axis) on the prediction of MH outcomes for a set of proxy unreported BGs (test split in 2019, 320 BGs). We evaluate models on the correct identification of the BG risk category and plot the F1-score on the y-axis. The lines and shaded regions represent the average value and range of F1-scores, calculated over 10 seeds.
  • Figure 5: Longitudinal Trend of Target Variable. In this figure, we present the average values of MH outcomes for the years 2015 to 2019. The average value is calculated separately for each ADI value considered in the analysis. An increasing trend in MH outcome values is observed across all BGs, irrespective of their socio-economic status.
  • ...and 5 more figures
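
The captions for Figures 3 and 4 report the average and range of F1-scores over 10 seeds. A minimal sketch of that summary is given below, assuming a binary high-risk/low-risk labeling of block groups (the captions do not state how many risk categories are used) and using three made-up prediction runs in place of the 10 seeds. The function names are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize_over_seeds(y_true, preds_by_seed):
    """Mean, minimum, and maximum F1 across runs (the line and shaded band in a plot)."""
    scores = [f1_score(y_true, preds) for preds in preds_by_seed]
    return {"mean": sum(scores) / len(scores), "min": min(scores), "max": max(scores)}


# Toy labels: 1 = high-risk block group, 0 = low-risk (illustrative only).
y_true = [1, 0, 1, 1, 0, 0]
preds_by_seed = [
    [1, 0, 1, 1, 0, 0],  # all correct
    [1, 0, 0, 1, 0, 1],  # one missed positive, one false positive
    [0, 0, 1, 0, 0, 0],  # two missed positives
]
summary = summarize_over_seeds(y_true, preds_by_seed)
```

Reporting the range alongside the mean shows how sensitive a model is to random initialization at each data-availability level, which a single-run score would hide.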