Public Health in Disaster: Emotional Health and Life Incidents Extraction during Hurricane Harvey
Thomas Hoang, Quynh Anh Nguyen, Long Nguyen
TL;DR
This study investigates how Hurricane Harvey affected emotional health and daily life experiences as expressed on Twitter. It deploys a multi-stage pipeline that blends a BERT-based emotion classifier with LDA-based life-incident extraction, refines topic clustering through Graph Neural Network embeddings, and uses a Large Language Model to generate descriptive names for each incident cluster. The integration of GNNs with LDA and LLM-based naming enhances clustering accuracy and interpretability, enabling timely, actionable insights for disaster preparedness and response. By leveraging real-time social media signals, the work contributes a framework for monitoring emotional health and life incidents during disasters to inform public health interventions and emergency management strategies.
Abstract
Countless disasters have resulted from climate change, causing severe damage to infrastructure and the economy. These disasters have significant societal impacts, necessitating mental health services for the millions affected. To prepare for and respond effectively to such events, it is important to understand people's emotions and the life incidents they experience before and after a disaster strikes. In this case study, we collected a dataset of approximately 400,000 public tweets related to Hurricane Harvey. Using a BERT-based model, we predicted the emotions associated with each tweet. To efficiently identify the life incidents discussed in these tweets, we used Latent Dirichlet Allocation (LDA) for topic modeling, which allowed us to bypass manual content analysis and extract meaningful patterns from the data. However, rather than stopping at topic identification as previous methods do \cite{math11244910}, we further refined our analysis by integrating Graph Neural Networks (GNNs) and Large Language Models (LLMs). A GNN was employed to generate tweet embeddings and construct a similarity graph of the tweets, which was then used to optimize clustering. Subsequently, we used an LLM to automatically generate descriptive names for each life-incident cluster, offering critical insights for disaster preparedness and response strategies.
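The similarity-graph step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the graph is built or how clusters are derived from it, so the cosine-similarity threshold, the use of connected components as clusters, the function names, and the toy three-dimensional vectors standing in for GNN embeddings are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(embeddings, threshold=0.8):
    """Link every pair of items whose cosine similarity meets the threshold.

    The threshold value is an assumption, not a figure from the paper.
    """
    graph = {i: set() for i in range(len(embeddings))}
    for i, j in combinations(range(len(embeddings)), 2):
        if cosine(embeddings[i], embeddings[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Treat each connected component of the graph as one cluster."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Hypothetical embeddings for five tweets (illustrative values only;
# in the described pipeline these would come from the GNN).
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]
clusters = connected_components(similarity_graph(toy_embeddings))
print(clusters)
```

With these toy vectors, tweets 0 and 1 fall into one cluster, tweets 2 and 3 into another, and tweet 4 stands alone. In a pipeline like the one described, each resulting cluster would then be passed to an LLM to obtain a descriptive name.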
