An Interpretable Implicit-Based Approach for Modeling Local Spatial Effects: A Case Study of Global Gross Primary Productivity
Siqi Du, Hongsheng Huang, Kaixin Shen, Ziqi Liu, Shengjun Tang
TL;DR
The paper tackles spatiotemporal heterogeneity in geographic ML by separating location-invariant laws from location-specific differences. It introduces a dual-branch encoder–decoder: one branch aggregates a spatiotemporal conditional graph with GCN and LSTM to encode implicit location-specific conditions, and the other uses a self-attention-based encoder to extract shared patterns. A Transformer-based decoder with cross-attention then predicts targets and estimates interpretable feature weights. Validated on Climate2GPP for global GPP prediction (2001–2020) using ERA5-Land, MODIS Land Cover, and PML_V2 data, the method achieves an RMSE of $0.836$ and $R^2=0.932$, outperforming the LightGBM Large, TabNet, and GWR baselines. Visualization reveals that the dominant factors vary across time and space, demonstrating both strong predictive performance and interpretable insights, with potential for broader geographic ML applications.
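For reference, the two metrics reported above have standard definitions. The symbols here are notation assumed for illustration, not taken from the paper: $y_i$ is the observed GPP of test sample $i$, $\hat{y}_i$ the model prediction, $\bar{y}$ the mean of the observations, and $n$ the number of test samples.

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
$$

RMSE is expressed in the units of the target, so lower is better, while $R^2$ is the fraction of target variance explained, so values closer to 1 are better.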
Abstract
In Earth sciences, unobserved factors exhibit non-stationary spatial distributions, causing the relationships between features and targets to display spatial heterogeneity. In geographic machine learning tasks, conventional statistical learning methods often struggle to capture this heterogeneity, leading to unsatisfactory prediction accuracy and unreliable interpretability. While approaches like Geographically Weighted Regression (GWR) capture local variations, they fall short of uncovering global patterns and tracking the continuous evolution of spatial heterogeneity. Motivated by this limitation, we propose a novel perspective: simultaneously modeling the features common to different locations and the spatial differences between them using deep neural networks. The proposed method is a dual-branch neural network with an encoder-decoder structure. In the encoding stage, the method aggregates node information in a spatiotemporal conditional graph using GCN and LSTM, encoding location-specific spatiotemporal heterogeneity as an implicit conditional vector. In parallel, a self-attention-based encoder extracts location-invariant common features from the data. In the decoding stage, the approach employs a conditional generation strategy that predicts response variables and interpretive weights from the data features under the encoded spatiotemporal conditions. The approach is validated by predicting vegetation gross primary productivity (GPP) using global climate and land cover data from 2001 to 2020. Trained on 50 million samples and tested on 2.8 million, the proposed model achieves an RMSE of 0.836, outperforming LightGBM (1.063) and TabNet (0.944). Visualization analyses indicate that our method can reveal how the dominant factors of GPP differ across times and locations.
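The encoding and decoding stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It covers a single time step, so the LSTM is omitted. The self-attention encoder is replaced by a plain per-feature embedding. All weights are random rather than trained. The function names, the ring-shaped toy graph, the dimensions, and the final weighted-sum prediction are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(adj, h, w):
    """One graph-convolution step: symmetric-normalized neighbor
    aggregation (with self-loops), then a linear map and ReLU."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_weights(cond, feat_tokens, w_q, w_k):
    """The location condition vector acts as the query and the per-feature
    tokens as keys; the softmax yields one weight per feature per location."""
    q = cond @ w_q                      # (n_nodes, d)
    k = feat_tokens @ w_k               # (n_nodes, n_features, d)
    scores = np.einsum("nd,nfd->nf", q, k) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1)

# Toy setup (hypothetical sizes): 5 locations on a ring, 4 input features.
n_nodes, n_features, d = 5, 4, 8
adj = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    j = (i + 1) % n_nodes
    adj[i, j] = adj[j, i] = 1.0

x = rng.normal(size=(n_nodes, n_features))        # features per location

# Condition branch: neighbor aggregation gives a location-specific vector.
cond = gcn_layer(adj, x, rng.normal(size=(n_features, d)))

# Shared branch (stand-in): the same embedding table is used at every location.
emb = rng.normal(size=(n_features, d))
feat_tokens = x[:, :, None] * emb[None, :, :]     # (n_nodes, n_features, d)

# Decoder (stand-in): condition-dependent feature weights, then an assumed
# weighted-sum prediction so that the weights are directly readable.
weights = cross_attention_weights(
    cond, feat_tokens, rng.normal(size=(d, d)), rng.normal(size=(d, d))
)
pred = (weights * x).sum(axis=1)
```

Each row of `weights` is non-negative and sums to 1, so it can be read as the relative importance of each input feature at that location. That is the kind of per-location quantity that could be mapped to show which factor dominates where.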
