Extracting Disaster Impacts and Impact Related Locations in Social Media Posts Using Large Language Models
Sameeah Noreen Hameed, Surangika Ranathunga, Raj Prasanna, Kristin Stock, Christopher B. Jones
TL;DR
The paper addresses the need for timely, fine-grained disaster situational awareness by extracting both impacts and directly impacted locations from social media posts. It introduces the Disaster Impacted Location Corpus (DILC), built from IDRISI-DE, and evaluates a range of methods from traditional NER to large language models (LLMs), with a focus on prompt design and post-processing to reduce hallucinations. Findings show that LLMs, especially Llama-3 variants with carefully crafted prompts and post-processing, outperform baselines in all-locations extraction, and that disaster-specific fine-tuning improves impact extraction, though not uniformly across all tasks. The work offers a scalable approach to transforming unstructured social media content into actionable geospatial information for disaster response, with multilingual expansion and richer geocoding identified as future work.
Abstract
Large-scale disasters often have catastrophic consequences for people and infrastructure. Situational awareness of such impacts derived from authoritative data (in-situ sensors, remote sensing imagery, and/or geographic data) is often limited by atmospheric opacity, satellite revisit intervals, and time constraints, which results in geo-temporal information gaps. In contrast, impact-related social media posts can act as "geo-sensors" during a disaster, because people describe specific impacts and locations. However, not all locations mentioned in disaster-related social media posts relate to an impact, and only the impacted locations are critical for directing resources effectively. For example, the post "The death toll from a fire which ripped through the Greek coastal town of #Mati stood at 80, with dozens of people unaccounted for as forensic experts tried to identify victims who were burned alive #Greecefires #AthensFires #Athens #Greece." contains the impacted location "Mati" and the non-impacted locations "Greece" and "Athens". This research uses Large Language Models (LLMs) to identify all locations, impacts, and impacted locations mentioned in disaster-related social media posts. In the process, LLMs are fine-tuned to identify only impacts and impacted locations (as distinct from other, non-impacted locations), including locations mentioned in informal expressions, abbreviations, and short forms. Our fine-tuned model achieves an F1-score of 0.69 for impact extraction and 0.74 for impacted location extraction, substantially outperforming the pre-trained baseline. These results indicate the potential of fine-tuned language models to offer responders a scalable solution for timely decision-making in resource allocation, situational awareness, and post-disaster recovery planning.
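To make the reported F1-scores and the hallucination-reducing post-processing more concrete, the following minimal Python sketch scores a set of predicted impacted locations against a gold set for the Mati example above. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `normalize`, `filter_hallucinations`, and `precision_recall_f1`, the verbatim-substring filter, the exact-match set-based scoring, and the predicted list (including the location "Rafina", which does not occur in the post) are all hypothetical.

```python
def normalize(name: str) -> str:
    """Lowercase and drop a leading '#' so '#Mati' and 'Mati' compare equal."""
    return name.strip().lstrip("#").lower()


def filter_hallucinations(post: str, predicted: list[str]) -> list[str]:
    """Keep only predicted locations whose normalized form occurs in the post.

    Assumption: a simple substring check stands in for whatever
    post-processing the authors actually apply.
    """
    text = post.lower()
    return [p for p in predicted if normalize(p) and normalize(p) in text]


def precision_recall_f1(predicted: list[str], gold: list[str]) -> tuple[float, float, float]:
    """Set-based exact-match precision, recall, and F1 (an assumed scheme)."""
    pred = {normalize(p) for p in predicted}
    ref = {normalize(g) for g in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


post = (
    "The death toll from a fire which ripped through the Greek coastal town "
    "of #Mati stood at 80 #Greecefires #AthensFires #Athens #Greece."
)
# Hypothetical model output: "Rafina" is not in the post, "Athens" is not impacted.
predicted_impacted = ["Mati", "Athens", "Rafina"]
gold_impacted = ["Mati"]

kept = filter_hallucinations(post, predicted_impacted)
precision, recall, f1 = precision_recall_f1(kept, gold_impacted)
print(kept, round(precision, 2), round(recall, 2), round(f1, 2))
```

In this toy case the filter removes the location that never appears in the post, while the remaining false positive ("Athens") still lowers precision, which is the distinction between all-locations extraction and impacted-location extraction that the abstract draws.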
