Extracting Disaster Impacts and Impact Related Locations in Social Media Posts Using Large Language Models

Sameeah Noreen Hameed, Surangika Ranathunga, Raj Prasanna, Kristin Stock, Christopher B. Jones

TL;DR

The paper addresses the need for timely, fine-grained disaster situational awareness by extracting both impacts and directly impacted locations from social media posts. It introduces the Disaster Impacted Location Corpus (DILC), built from IDRISI-DE, and evaluates a range of methods from traditional NER to large language models, with a focus on prompt design and post-processing to reduce hallucinations. Findings show that LLMs, especially Llama-3 variants with carefully crafted prompts and post-processing, outperform baselines in all-locations extraction, and that disaster-specific fine-tuning improves impact extraction, though not uniformly for all tasks. This work offers a scalable approach to transforming unstructured social media content into actionable geospatial information for disaster response, with clear directions for multilingual expansion and richer geocoding in future work.

Abstract

Large-scale disasters often have catastrophic consequences for people and infrastructure. Situation awareness of such disaster impacts, when derived from authoritative data such as in-situ sensors, remote sensing imagery, and/or geographic data, is often limited by atmospheric opacity, satellite revisit intervals, and time constraints, which results in geo-temporal information gaps. In contrast, impact-related social media posts can act as "geo-sensors" during a disaster, where people describe specific impacts and locations. However, not all locations mentioned in disaster-related social media posts relate to an impact, and only the impacted locations are critical for directing resources effectively. For example, the post "The death toll from a fire which ripped through the Greek coastal town of #Mati stood at 80, with dozens of people unaccounted for as forensic experts tried to identify victims who were burned alive #Greecefires #AthensFires #Athens #Greece." contains the impacted location "Mati" and the non-impacted locations "Greece" and "Athens". This research uses Large Language Models (LLMs) to identify all locations, impacts, and impacted locations mentioned in disaster-related social media posts. In the process, LLMs are fine-tuned to identify only impacts and impacted locations (as distinct from other, non-impacted locations), including locations mentioned in informal expressions, abbreviations, and short forms. Our fine-tuned model achieves an F1-score of 0.69 for impact extraction and 0.74 for impacted location extraction, substantially outperforming the pre-trained baseline. These results indicate the potential of fine-tuned language models to offer a scalable solution for timely decision-making in resource allocation, situational awareness, and post-disaster recovery planning for responders.
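
The TL;DR mentions post-processing of LLM output to reduce hallucinations, but this page does not describe how it is done. As a rough illustration only, one simple check of this kind is to discard any predicted location that does not literally occur in the post. The sketch below assumes exactly that; the function name `filter_hallucinated` and the predicted list are hypothetical and are not taken from the paper.

```python
def filter_hallucinated(post: str, predicted: list[str]) -> list[str]:
    """Keep only predicted location strings that literally occur in the post.

    Assumption (not from the paper): a case-insensitive substring match is
    used as the grounding test. This is deliberately crude, e.g. "Greece"
    would also match inside the hashtag "#Greecefires".
    """
    normalized_post = post.casefold()
    kept: list[str] = []
    for location in predicted:
        cleaned = location.strip()
        # Compare without a leading '#' so "Mati" and "#Mati" are treated alike.
        candidate = cleaned.lstrip("#").casefold()
        if candidate and candidate in normalized_post and cleaned not in kept:
            kept.append(cleaned)
    return kept


# Example post quoted in the abstract.
post = (
    "The death toll from a fire which ripped through the Greek coastal town "
    "of #Mati stood at 80, with dozens of people unaccounted for as forensic "
    "experts tried to identify victims who were burned alive "
    "#Greecefires #AthensFires #Athens #Greece."
)

# Hypothetical model output: "Rafina" does not appear in the post.
predicted = ["Mati", "Rafina", "Athens"]

print(filter_hallucinated(post, predicted))
```

A filter like this only removes locations that are absent from the text. Deciding that "Mati" is impacted while "Athens" is not is the harder task that the abstract assigns to the fine-tuned model.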

Paper Structure

This paper contains 22 sections, 3 figures, 8 tables.

Figures (3)

  • Figure 1: LLM-based real-time disaster impact mapping from social media data. The process filters non-impact locations to help emergency managers target and prioritise response efforts.
  • Figure 2: Workflow for fine-tuning of LLMs
  • Figure 3: Post-processing of LLM responses