MASH: A Multiplatform and Multimodal Annotated Dataset for Societal Impact of Hurricane

Ruichen Yao, Aslanbek Murzakhmetov, Raaghav Pillai, Aliya Maussymbayeva, Zelin Li, Yifan Liu, Yaokun Liu, Lanyu Shang, Yang Zhang, Na Wei, Ximing Cai, Dong Wang

TL;DR

MASH is presented as the first large-scale, multi-platform, multimodal, and multi-dimensionally annotated dataset centered on hurricane disasters, together with an online platform that supports interactive data exploration, provides preliminary analytical results, and lets users share their insights on the societal impacts of hurricanes.

Abstract

Natural disasters pose multidimensional threats to human societies, and hurricanes are among the most disruptive of these events: they not only cause severe physical damage but also spark widespread discussion on social media platforms. Existing datasets for studying the societal impacts of hurricanes often focus on older hurricanes and are limited to a single social media platform, so they fail to capture the broader societal impact in today's diverse social media environment. Moreover, existing datasets annotate the visual and textual content of a post separately, failing to account for the multimodal nature of social media posts. To address these gaps, we present a Multiplatform and Multimodal Annotated Dataset for Societal Impact of Hurricane (MASH) that includes 59,607 relevant social media posts from Reddit, TikTok, and YouTube. All relevant posts are annotated with a multimodal approach that considers both textual and visual content along three dimensions: Humanitarian Classes, Bias Classes, and Information Integrity Classes. To the best of our knowledge, MASH is the first large-scale, multi-platform, multimodal, and multi-dimensionally annotated dataset centered on hurricane disasters. In addition, we introduce an online platform that supports interactive data exploration, provides preliminary analytical results, and allows users to share their insights regarding the societal impacts of hurricanes. We envision that MASH can contribute to the study of hurricanes' impact on society in areas such as disaster response, disaster severity classification, public sentiment analysis, disaster policy making, and bias identification. The dataset is publicly available at https://huggingface.co/datasets/YRC10/MASH under the Creative Commons Attribution 4.0 (CC BY 4.0) license.

Paper Structure

This paper contains 19 sections, 10 figures, 10 tables.

Figures (10)

  • Figure 1: MASH Dataset and Online Platform
  • Figure 2: Humanitarian Class vs. Bias Class Correlation
  • Figure 3: False Info vs. Humanitarian and Bias Classes
  • Figure 4: Temporal Trend of Humanitarian, Bias, and Information Integrity Classes
  • Figure 5: Spatial Distribution of Posts Across U.S. States
  • ...and 5 more figures