Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts

S. Mostafa Mousavi, Marc Stogaitis, Tajinder Gadh, Richard M Allen, Alexei Barski, Robert Bosch, Patrick Robertson, Nivetha Thiruverahan, Youngmin Cho, Aman Raj

TL;DR

This paper uses a state-of-the-art large language model, Gemini 1.5 Pro, to estimate earthquake ground shaking intensity from unstructured, multi-modal social media posts, showing that such unconventional sources can yield scientifically valuable information about Earth's physical phenomena.

Abstract

This paper presents a novel approach to extract scientifically valuable information about Earth's physical phenomena from unconventional sources, such as multi-modal social media posts. Employing a state-of-the-art large language model (LLM), Gemini 1.5 Pro (Reid et al. 2024), we estimate earthquake ground shaking intensity from these unstructured posts. The model's output, in the form of Modified Mercalli Intensity (MMI) values, aligns well with independent observational data. Furthermore, our results suggest that LLMs, trained on vast internet data, may have developed a unique understanding of physical phenomena. Specifically, Google's Gemini models demonstrate a simplified understanding of the general relationship between earthquake magnitude, distance, and MMI intensity, accurately describing observational data even though that relationship is not identical to established models. These findings raise intriguing questions about the extent to which Gemini's training has led to a broader understanding of the physical world and its phenomena. The ability of Generative AI models like Gemini to generate results consistent with established scientific knowledge highlights their potential to augment our understanding of complex physical phenomena like earthquakes. The flexible and effective approach proposed in this study holds immense potential for enriching our understanding of the impact of physical phenomena and improving resilience during natural disasters. This research is a significant step toward harnessing the power of social media and AI for natural disaster mitigation, opening new avenues for understanding the emerging capabilities of Generative AI and LLMs for scientific applications.
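
The general relationship between magnitude, distance, and MMI mentioned in the abstract can be illustrated with a toy attenuation fit. The sketch below is not the paper's method or any established ground-motion model. It assumes a single earthquake, so the magnitude term is constant and folds into the intercept, and it assumes the simple form MMI = a + b * log10(R), with R the epicentral distance in km. The function names, the coefficients, and the data are all hypothetical and made up for illustration.

```python
import math
import random


def fit_log_distance_decay(distances_km, mmi_values):
    """Ordinary least-squares fit of MMI = a + b * log10(distance_km).

    Returns (intercept a, slope b). For one event the magnitude is fixed,
    so its contribution is absorbed into the intercept.
    """
    xs = [math.log10(d) for d in distances_km]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mmi_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mmi_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_mmi(intercept, slope, distance_km):
    """Evaluate the fitted log-linear relation at one distance."""
    return intercept + slope * math.log10(distance_km)


# Synthetic observations generated from hypothetical coefficients plus noise.
# None of these numbers come from the paper.
random.seed(0)
TRUE_INTERCEPT, TRUE_SLOPE = 7.0, -2.0
distances = [random.uniform(5.0, 300.0) for _ in range(200)]
observed = [
    predict_mmi(TRUE_INTERCEPT, TRUE_SLOPE, d) + random.gauss(0.0, 0.3)
    for d in distances
]

intercept, slope = fit_log_distance_decay(distances, observed)
print(f"fitted MMI = {intercept:.2f} + ({slope:.2f}) * log10(R_km)")
```

A negative fitted slope corresponds to intensity decaying with distance. Comparing model-estimated intensities with observations would, under these assumptions, amount to comparing two such fitted curves or their residuals.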

Paper Structure

This paper contains 5 sections, 9 figures, and 1 table.

Figures (9)

  • Figure 1: Diversity of the dataset used in this study, which comprises screenshots and screen recordings of social media posts documenting individual (e.g., a, d, e, and f) or group (e.g., b) experiences of earthquake shaking, as well as the responses of animals (e.g., h). These posts encompass a range of formats, including images (b) and videos (d to k) containing textual information, presented in various languages (e.g., c), sizes/durations, and background settings. The video content spans both indoor (g, h, and i) and outdoor (j, k, and l) environments. Indoor videos primarily consist of CCTV footage capturing the moment of earthquake shaking, while outdoor videos include similar CCTV footage as well as recordings of infrastructure damage. Additionally, the dataset incorporates post-earthquake narrative videos in which individuals describe their personal experiences and observations during the earthquake (e.g., d to f).
  • Figure 2: A sample of Gemini's output for a social media post. This post documents the ground shaking experienced in Boonton, NJ, located approximately 37.6 km from the epicenter of the M4.8 earthquake that occurred on April 5, 2024, in Tewksbury, New Jersey, USA.
  • Figure 3: Comparison of estimated Modified Mercalli Intensities (MMIs) from the Gemini model (dark blue circles with error bars) against several sources of observed data. These include: (1) instrumentally derived MMIs computed from peak ground acceleration (PGA) recorded by seismic stations; (2) "Did You Feel It?" (DYFI) macroseismic data collected by the USGS; and (3) the expected ground motion attenuation model for rock sites on the East Coast, along with its ±1 standard deviation range. Panels (a) and (b) display the results for the New Jersey and Oklahoma earthquakes, respectively. Note that the distance scale on the x-axis is logarithmic.
  • Figure 4: Histograms illustrating the distribution of earthquake intensity with respect to epicentral distance for the New Jersey (a) and Oklahoma (b) earthquakes, utilizing USGS DYFI data. Each panel presents a 2D histogram, while its margins display 1D histograms with counts on the axes. The blue circles represent the estimated mean Modified Mercalli Intensity (MMI) values for individual social media posts, as determined by the Gemini model.
  • Figure 5: Comparison of Gemini's estimates with (a) and without (b) epicentral distance and earthquake magnitude in the prompt. Each boxplot with whiskers illustrates the distribution, quartiles, and outliers of reported "Did You Feel It?" (DYFI) data for individual earthquakes within specific zip codes of a city. Circle markers represent the estimated mean MMI values derived from each social media post, within the same city, analyzed by the Gemini model for each event.
  • ...and 4 more figures
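
Several captions above plot or quote epicentral distance (for example, the roughly 37.6 km between Boonton, NJ and the epicenter in the Figure 2 caption). The paper summary does not say how those distances were computed. A minimal sketch, assuming a spherical Earth and the standard haversine great-circle formula, is given below. The coordinates are approximate values assumed here for illustration and are not taken from the paper, and `epicentral_distance_km` is a hypothetical helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def epicentral_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2
    )
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate (latitude, longitude) pairs assumed for illustration only.
ASSUMED_EPICENTER = (40.689, -74.754)   # near Tewksbury, New Jersey
ASSUMED_BOONTON_NJ = (40.903, -74.407)  # Boonton, New Jersey

distance = epicentral_distance_km(*ASSUMED_EPICENTER, *ASSUMED_BOONTON_NJ)
print(f"assumed epicentral distance: {distance:.1f} km")
```

With reasonable coordinates the result should fall in the same range as the distance quoted in the caption. The exact figure depends on the coordinates and Earth model chosen.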