Cluster-Segregate-Perturb (CSP): A Model-agnostic Explainability Pipeline for Spatiotemporal Land Surface Forecasting Models

Tushar Verma, Sudipan Saha

TL;DR

The paper tackles the challenge of explainability in spatiotemporal land-surface forecasting by proposing the Cluster-Segregate-Perturb (CSP) pipeline, a model-agnostic framework that combines clustering of meteorological time series, segment-level data segregation, and perturbation-based analyses. Implemented on a ConvLSTM trained with the EarthNet2021 dataset for NDVI prediction, CSP uses Soft-DTW-based time-series clustering to form weather segments and then perturbs variables within each segment to quantify marginal effects and nonlinear relationships, revealing that precipitation is the strongest driver of NDVI with notable nonlinear correlations and minimal direct pressure impact. The approach yields both local (segment-level) and global (weighted) insights, demonstrating the utility of segment-aware perturbations for interpretable analysis of high-dimensional spatiotemporal models. These contributions offer a robust, scalable path toward more transparent Earth observation forecasting with practical implications for climate and vegetation monitoring.
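The Soft-DTW distance mentioned above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `softmin` and `soft_dtw`, the squared-difference cost, and the default `gamma` are assumptions chosen for illustration. It shows the standard recursion in which the hard minimum of classical DTW is replaced by a smooth soft-minimum.

```python
import numpy as np

def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    z = -np.asarray(values, dtype=float) / gamma
    m = z.max()
    return -gamma * (m + np.log(np.exp(z - m).sum()))

def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW discrepancy between two univariate series (illustrative sketch).

    Assumes a squared-difference cost between time steps.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = (x[:, None] - y[None, :]) ** 2
    # R[i, j] holds the soft-minimal alignment cost of x[:i] and y[:j].
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            prev = (R[i - 1, j], R[i - 1, j - 1], R[i, j - 1])
            R[i, j] = cost[i - 1, j - 1] + softmin(prev, gamma)
    return float(R[n, m])

# A series and a time-shifted copy align almost perfectly under warping,
# while a flat series of a different shape does not.
a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1, 0]
c = [2, 2, 2, 2, 2, 2]
shifted = soft_dtw(a, b, gamma=0.01)
different = soft_dtw(a, c)
```

Because the measure tolerates temporal shifts, it is a natural fit for grouping weather series that share a shape but differ in timing. As `gamma` approaches zero, the value approaches the classical DTW cost.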

Abstract

Satellite images have become increasingly valuable for modelling regional climate change effects. Earth surface forecasting is one such task: it integrates satellite images with meteorological data to capture the joint evolution of regional climate change effects. However, understanding the complex relationship between specific meteorological variables and land surface evolution poses a significant challenge. In light of this challenge, our paper introduces a pipeline that integrates principles from both perturbation-based explainability techniques like LIME and global marginal explainability techniques like PDP, while also addressing the constraints of applying such techniques to high-dimensional spatiotemporal deep models. The proposed pipeline simplifies diverse investigative analyses, such as marginal sensitivity analysis, marginal correlation analysis, and lag analysis, on complex land surface forecasting models. In this study, we used Convolutional Long Short-Term Memory (ConvLSTM) as the surface forecasting model and analysed the Normalized Difference Vegetation Index (NDVI) of the surface forecasts, since meteorological variables like temperature, pressure, and precipitation significantly influence it. The study area encompasses various regions in Europe. Our analyses show that precipitation exhibits the highest sensitivity in the study area, followed by temperature and pressure. Pressure has little to no direct effect on NDVI. Additionally, interesting nonlinear correlations between meteorological variables and NDVI have been uncovered.
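The segregate and perturb steps can be sketched on toy data as follows. Everything here is hypothetical scaffolding rather than the paper's code: `toy_model` stands in for the ConvLSTM forecaster, the random `labels` stand in for per-variable cluster assignments, the 10% scaling is an arbitrary perturbation, and weighting segments by their sample count is an assumed reading of the "global (weighted)" aggregation.

```python
import numpy as np
from collections import defaultdict

VARIABLES = ["precipitation", "temperature", "pressure"]

def toy_model(weather):
    """Hypothetical forecaster: maps (N, T, 3) weather to (N, T) NDVI-like values."""
    precip, temp, press = weather[..., 0], weather[..., 1], weather[..., 2]
    # Pressure is given zero weight purely so the toy example has a known answer.
    return np.tanh(0.8 * precip + 0.3 * temp + 0.0 * press)

def segregate(labels):
    """Group sample indices by their tuple of per-variable cluster labels."""
    segments = defaultdict(list)
    for idx, key in enumerate(map(tuple, labels)):
        segments[key].append(idx)
    return {key: np.asarray(idx) for key, idx in segments.items()}

def marginal_sensitivity(model, weather, segments, var, scale=1.1):
    """Mean absolute output change per segment when one variable is scaled."""
    local = {}
    for key, idx in segments.items():
        base = weather[idx]
        perturbed = base.copy()
        perturbed[..., var] *= scale  # perturb only the chosen variable
        local[key] = float(np.mean(np.abs(model(perturbed) - model(base))))
    return local

def global_sensitivity(local, segments):
    """Frequency-weighted average of segment-level sensitivities (assumed weighting)."""
    keys = list(local)
    weights = np.array([len(segments[k]) for k in keys])
    values = np.array([local[k] for k in keys])
    return float(np.average(values, weights=weights))

rng = np.random.default_rng(0)
weather = rng.uniform(0.1, 1.0, size=(200, 30, 3))  # (samples, time steps, variables)
labels = rng.integers(0, 3, size=(200, 3))          # placeholder cluster assignments
segments = segregate(labels)

results = {
    name: global_sensitivity(
        marginal_sensitivity(toy_model, weather, segments, var), segments
    )
    for var, name in enumerate(VARIABLES)
}
```

The segment-level dictionaries give the local view, and `results` gives the global one. Because the model is only ever called on perturbed inputs, the same loop could wrap any forecaster, which is what makes the approach model-agnostic.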

Paper Structure (20 sections, 19 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: We identified 9 unique clusters (representing base temporal patterns) for precipitation and pressure, and 7 unique clusters for temperature. The frequency displayed on each plot indicates the number of data samples assigned to each cluster. In the pressure column, the 2nd, 4th and 8th plots exhibit noise patterns in the pressure data, so these clusters can be safely disregarded.
  • Figure 2: Average NDVI signals for different perturbations of the meteorological variables, for a weather segment whose $c_i$ is (6,5,5,4,2) and whose frequency is 465 samples.
  • Figure 3: Correlation patterns between meteorological variables and NDVI of the predictions. Temperature is split into two curves to enhance visualization: lower NDVI scenes exhibit greater correlation curve curvature, which decreases as NDVI increases.
  • Figure 4: Curve fitting of the correlation graphs of a weather segment for different meteorological variables.