Landslide Hazard Mapping with Geospatial Foundation Models: Geographical Generalizability, Data Scarcity, and Band Adaptability

Wenwen Li, Sizhe Wang, Hyunho Lee, Chenyan Lu, Sujit Roy, Rahul Ramachandran, Chia-Yu Hsu

TL;DR

This study introduces a three-axis framework (sensor, label, and domain) to evaluate geospatial foundation models (GeoFMs) for landslide mapping, focusing on Prithvi-EO-2.0. In experiments on Landslide4Sense and cross-dataset tests with Landslide Reference and GVLM-S2, Prithvi-EO-2.0 demonstrates better band adaptability, data efficiency, and geographic generalizability than task-specific CNNs and other GeoFMs. The results show that EO-native pretraining enables the model to adapt rapidly to varied spectral inputs, perform robustly under label scarcity, and transfer more reliably across regions, albeit with higher computational demands and data-reuse challenges. Overall, GeoFMs like Prithvi-EO-2.0 offer a scalable, transferable approach to proactive landslide risk monitoring and environmental management; future work aims to reduce compute needs and improve domain adaptation through methods such as visual prompt tuning and integrated pre/post-disaster data.

Abstract

Landslides cause severe damage to lives, infrastructure, and the environment, making accurate and timely mapping essential for disaster preparedness and response. However, conventional deep learning models often struggle when applied across different sensors, regions, or under conditions of limited training data. To address these challenges, we present a three-axis analytical framework of sensor, label, and domain for adapting geospatial foundation models (GeoFMs), focusing on Prithvi-EO-2.0 for landslide mapping. Through a series of experiments, we show that Prithvi-EO-2.0 consistently outperforms task-specific CNNs (U-Net, U-Net++), vision transformers (SegFormer, SwinV2-B), and other GeoFMs (TerraMind, SatMAE). The model, built on global pretraining, self-supervision, and adaptable fine-tuning, proved resilient to spectral variation, maintained accuracy under label scarcity, and generalized more reliably across diverse datasets and geographic settings. Alongside these strengths, we also highlight remaining challenges such as computational cost and the limited availability of reusable AI-ready training data for landslide research. Overall, our study positions GeoFMs as a step toward more robust and scalable approaches for landslide risk reduction and environmental monitoring.

Paper Structure

This paper contains 21 sections, 8 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Conceptual framework illustrating the evolution of landslide mapping techniques and their role in breaking the cycle of vulnerability. Top panel: transition from traditional expert-driven analysis, to task-specific deep learning, to geospatial foundation models (GeoFMs) that leverage self-supervised learning and fine-tuning for downstream tasks. The green arrow represents the technological shift toward scalable, adaptable, and data-efficient solutions. Bottom panel: the cycle of vulnerability, where climate change and human activities drive landslides, leading to impacts that increase community vulnerability. By enabling multi-band adaptability, few-shot capability, and transferable knowledge, GeoFMs provide a pathway to proactive risk management and aim to disrupt this reinforcing cycle.
  • Figure 2: Overview of the Landslide4Sense benchmark dataset. (a) Study regions spanning diverse climatic, geological, and geomorphic settings. These sites capture landslide activity triggered by both earthquakes and extreme rainfall. (b) Class distribution of landslide versus background pixels, highlighting the extreme class imbalance: landslides constitute only 2.2% of the total labeled area, and most image patches contain fewer than 10% landslide pixels. (c) Example multi-sensor inputs, including optical (RGB), infrared (NIR, SWIR), and topographic (DEM) data, alongside the ground truth mask. In the released dataset, all channels, including native 20 m/60 m Sentinel-2 bands and the topographic layers, are resampled to 10 m/pixel.
  • Figure 3: Overview of the Prithvi-EO-2.0 architecture and fine-tuning framework. (a) Pretraining stage (MAE): Multi-band EO imagery is divided into non-overlapping patches, and a subset is masked. A transformer encoder with an MAE reconstruction decoder is trained to reconstruct the missing patches by minimizing mean squared error (MSE). (b) Downstream fine-tuning stage (semantic segmentation): The pretrained encoder is paired with a lightweight convolutional decoder (Conv2D and ConvTranspose2D layers) to produce per-pixel landslide segmentation masks. The decoder is trained with imbalance-aware loss functions, including weighted cross-entropy (wCE), Lovász loss, and focal loss.
  • Figure 4: Adapter strategies for band alignment prior to the pretrained Prithvi encoder. Adapter 1: Linear Projection applies a per-pixel affine mapping to project from $B_{\mathrm{in}}$ to the six-band interface $B_{\mathrm{pre}}{=}6$. Adapter 2: U-Net Encoder Head uses a shallow convolutional encoder to capture local spatial context. The projected input $\mathbf{X}'\!\in\!\mathbb{R}^{H\times W\times 6}$ is passed to the Prithvi backbone $F_{\theta}$ (initialized from pretraining) and a lightweight FCN decoder $G_{\phi}$; softmax $\sigma$ applied to logits $\mathbf{S}$ yields the prediction $\hat{\mathbf{Y}}{=}\sigma(\mathbf{S})$. Inputs follow the spectral configurations in Table \ref{tab:band_configs}.
  • Figure 5: Sites used in the cross-dataset generalization study. Blue/orange: Landslide Reference train/val and test regions; green: Landslide Reference generalizability sites; red diamonds: GVLM-S2 external sites.
  • ...and 5 more figures
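
To make the Figure 4 caption concrete, the following is a minimal sketch of what a per-pixel affine band adapter (Adapter 1) and the final softmax could look like. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy 3-band input, and the hand-picked weights are all hypothetical, and in practice the projection weights would presumably be learned during fine-tuning rather than set by hand.

```python
import math

B_PRE = 6  # six-band interface of the pretrained encoder (from the Figure 4 caption)


def linear_band_adapter(image, weight, bias):
    """Per-pixel affine map from B_in input bands to B_PRE output bands.

    image:  H x W x B_in nested lists of floats
    weight: B_PRE x B_in matrix (hypothetical; would normally be learned)
    bias:   B_PRE vector       (hypothetical; would normally be learned)
    Returns an H x W x B_PRE nested list.
    """
    return [
        [
            [sum(w * x for w, x in zip(w_row, pixel)) + b
             for w_row, b in zip(weight, bias)]
            for pixel in row
        ]
        for row in image
    ]


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(s - peak) for s in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy input: a 1 x 2 image with B_in = 3 bands (illustrative values only).
image = [[[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]]

# Hypothetical weights: copy the 3 input bands into the first 3 output
# bands and fill the remaining 3 output bands with a constant bias.
weight = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
bias = [0.0, 0.0, 0.0, 0.5, 0.5, 0.5]

projected = linear_band_adapter(image, weight, bias)

# Two-class (background, landslide) logits for a single pixel.
probs = softmax([0.0, 0.0])
```

Because the mapping acts on each pixel independently, it changes only the band dimension and leaves the spatial layout untouched, which is the property that distinguishes Adapter 1 from the convolutional Adapter 2 described in the caption.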