SX-GeoTree: Self-eXplaining Geospatial Regression Tree Incorporating the Spatial Similarity of Feature Attributions

Chaogui Kang, Lijian Luo, Qingfeng Guan, Yu Liu

TL;DR

SX-GeoTree tackles the challenge of building interpretable geospatial models by embedding spatial similarity of feature attributions into a self-explaining regression tree. It jointly optimizes impurity reduction, spatial residual coherence via Moran's I, and explanation robustness via modularity on a consensus network built from GWR and SHAP distances, reframing local attribution stability as a network preservation problem. Empirical results on Fujian GDP and Seattle housing demonstrate competitive predictive accuracy with substantially improved spatial residual uniformity and attribution consensus, though multi-objective trade-offs can occur when spatial splits dominate. The approach offers a transferable framework for domain-aware explainability in geospatial ML and points to future work in soft/differentiable trees, bilevel optimization, and uncertainty-aware explanations.

Abstract

Decision trees remain central for tabular prediction but struggle with (i) capturing spatial dependence and (ii) producing locally stable (robust) explanations. We present SX-GeoTree, a self-explaining geospatial regression tree that integrates three coupled objectives during recursive splitting: impurity reduction (MSE), spatial residual control (global Moran's I), and explanation robustness via modularity maximization on a consensus similarity network formed from (a) geographically weighted regression (GWR) coefficient distances (stimulus-response similarity) and (b) SHAP attribution distances (explanatory similarity). We recast local Lipschitz continuity of feature attributions as a network community preservation problem, enabling scalable enforcement of spatially coherent explanations without per-sample neighborhood searches. Experiments on two exemplar tasks (county-level GDP in Fujian, n=83; point-wise housing prices in Seattle, n=21,613) show that SX-GeoTree maintains competitive predictive accuracy (within 0.01 $R^{2}$ of decision trees) while improving residual spatial evenness and doubling attribution consensus (modularity: Fujian 0.19 vs 0.09; Seattle 0.10 vs 0.05). Ablation confirms that the Moran's I and modularity terms are complementary; removing either degrades both spatial residual structure and explanation stability. The framework demonstrates how spatial similarity, extended beyond geometric proximity through GWR-derived local relationships, can be embedded in interpretable models, advancing trustworthy geospatial machine learning and offering a transferable template for domain-aware explainability.
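To make the two spatial terms named in the abstract concrete, the sketch below computes global Moran's I for a residual vector and Newman modularity for a partitioned similarity network. It uses only the standard textbook definitions of both statistics; it is not the paper's split criterion, and how SX-GeoTree weights or combines these terms with MSE is not shown. The function names, the toy 4-node path graph `W`, and the residual values are all hypothetical choices made for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I of `values` under a spatial weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # centered values
    s0 = sum(sum(row) for row in weights)               # total weight
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))    # spatial cross-product
    return (n / s0) * cross / sum(zi * zi for zi in z)


def modularity(adjacency, labels):
    """Newman modularity Q of a partition of an undirected weighted graph."""
    n = len(labels)
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)                                 # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:                  # same community only
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical toy data: four locations on a line, each linked to its neighbors.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)    # similar residuals sit together
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)  # neighbors always disagree
q_split = modularity(W, [0, 0, 1, 1])              # two contiguous communities
```

On this toy graph the clustered residuals give a positive Moran's I, the alternating residuals give a negative one, and the contiguous two-community partition has positive modularity. In the paper's setting the residuals would come from the tree's leaf predictions and the network from the GWR and SHAP distances described above.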

Paper Structure

This paper contains 22 sections, 4 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: The multivariate decision tree splits in GeoTrees
  • Figure 2: Illustration of the feature contribution and expected value estimation process using Tree SHAP. Note that $v_{e}$ denotes the expected value given the known features.
  • Figure 3: The proposed SX-GeoTree model
  • Figure 4: The construction and partition of the spatial similarity network
  • Figure 5: The estimated SHAP scores for locational features
  • ...and 4 more figures