Probabilistic Wildfire Susceptibility from Remote Sensing Using Random Forests and SHAP

Udaya Bhasker Cheerala, Varun Teja Chirukuri, Venkata Akhil Kumar Gummadi, Jintu Moni Bhuyan, Praveen Damacharla

TL;DR

The paper addresses California's wildfire risk by developing a probabilistic susceptibility map using Random Forests augmented with SHAP explainability. It combines diverse remote-sensing and geospatial data within Google Earth Engine and validates predictions through spatial transfer and temporal splits, revealing strong forest and grassland discrimination and ecosystem-specific drivers. SHAP analysis identifies soil organic carbon, tree cover, and NDVI as key forest drivers, and LST, elevation, and vegetation indices as dominant for grasslands, enabling interpretable risk mapping and targeted management. The work demonstrates robust performance and actionable insights, offering a scalable framework for regional fire risk assessment and decision support under changing climate and land-use conditions.
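
As a rough illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values by brute force for a three-feature toy score. It is not the paper's model and not the tree-specific algorithm that SHAP tooling typically applies to random forests: `toy_score`, its coefficients, the feature ordering (LST, NDVI, elevation), and the single-baseline treatment of "absent" features are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def exact_shapley(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values for one sample.

    Assumption: a feature that is 'absent' from a coalition is replaced by
    its value in `baseline` (a single reference point).
    """
    n = len(x)

    def value(subset: frozenset) -> float:
        # Features in `subset` keep their observed value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = frozenset(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(features: Sequence[float]) -> float:
    """Hypothetical susceptibility score; coefficients are made up."""
    lst, ndvi, elevation = features
    return 0.02 * lst - 0.5 * ndvi + 0.1 * lst * ndvi - 0.0001 * elevation


if __name__ == "__main__":
    sample = [40.0, 0.3, 500.0]       # hypothetical pixel: LST, NDVI, elevation
    reference = [25.0, 0.6, 1000.0]   # hypothetical baseline pixel
    for name, phi in zip(["LST", "NDVI", "elevation"],
                         exact_shapley(toy_score, sample, reference)):
        print(f"{name}: {phi:+.4f}")
```

The attributions sum to the difference between the score at the sample and the score at the baseline, which is the additivity property that lets per-feature contributions be read as shares of a single prediction. Brute-force enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.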

Abstract

Wildfires pose a significant threat to ecosystems worldwide, and California experiences recurring fires driven by climate, topography, vegetation patterns, and human activity. This study develops a comprehensive wildfire risk map for California by applying the random forest (RF) algorithm, augmented with Explainable Artificial Intelligence (XAI) through Shapley Additive exPlanations (SHAP) to interpret model predictions. Model performance was assessed using both spatial and temporal validation strategies. The RF model demonstrated strong predictive performance, achieving near-perfect discrimination for grasslands (AUC = 0.996) and forests (AUC = 0.997). Spatial cross-validation revealed moderate transferability, yielding ROC-AUC values of 0.6155 for forests and 0.5416 for grasslands. In contrast, temporal split validation showed enhanced generalization, especially for forests (ROC-AUC = 0.6615, PR-AUC = 0.8423). SHAP-based XAI analysis identified key ecosystem-specific drivers: soil organic carbon, tree cover, and Normalized Difference Vegetation Index (NDVI) were the most influential in forests, whereas Land Surface Temperature (LST), elevation, and vegetation health indices were dominant in grasslands. District-level classification revealed that the Central Valley and Northern Buttes districts had the highest concentration of high-risk grasslands, while Northern Buttes and North Coast Redwoods dominated forested high-risk areas. This RF-SHAP framework offers a robust, interpretable, and adaptable method for assessing wildfire risk, supporting informed decisions and targeted mitigation strategies.
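
The abstract reports different ROC-AUC values under spatial and temporal validation. The sketch below shows one way such splits can be built, so that whole spatial blocks or whole later years are held out. It is a minimal illustration, not the authors' procedure: the `(longitude, latitude, year)` sample layout, the one-degree grid blocks, the round-robin fold assignment, and all function names are assumptions.

```python
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple

# Hypothetical sample layout: (longitude, latitude, fire_year).
Sample = Tuple[float, float, int]


def block_id(lon: float, lat: float, block_deg: float = 1.0) -> Tuple[int, int]:
    """Grid cell (spatial block) containing the coordinate."""
    return (int(lon // block_deg), int(lat // block_deg))


def spatial_block_folds(
    samples: Sequence[Sample], n_folds: int = 3, block_deg: float = 1.0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in which no block is split across the two."""
    blocks = defaultdict(list)
    for i, (lon, lat, _year) in enumerate(samples):
        blocks[block_id(lon, lat, block_deg)].append(i)
    if len(blocks) < n_folds:
        raise ValueError("need at least as many spatial blocks as folds")
    # Assign whole blocks to folds round-robin, in sorted order for determinism.
    fold_members: List[List[int]] = [[] for _ in range(n_folds)]
    for k, key in enumerate(sorted(blocks)):
        fold_members[k % n_folds].extend(blocks[key])
    for k in range(n_folds):
        test = sorted(fold_members[k])
        train = sorted(i for j, m in enumerate(fold_members) if j != k for i in m)
        yield train, test


def temporal_split(
    samples: Sequence[Sample], first_test_year: int
) -> Tuple[List[int], List[int]]:
    """Train on samples before `first_test_year`, test on that year and later."""
    train = [i for i, s in enumerate(samples) if s[2] < first_test_year]
    test = [i for i, s in enumerate(samples) if s[2] >= first_test_year]
    return train, test


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under a purely random split, neighbouring pixels from the same fire perimeter can land on both sides of the split, which tends to inflate scores. Holding out whole blocks or whole years removes that shortcut, and a ROC-AUC of 0.5 corresponds to chance-level ranking.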

Paper Structure

This paper contains 10 sections, 17 figures, 2 tables.

Figures (17)

  • Figure 1: Study area in California showing forests, grasslands, and fire perimeters. (Source: NLCD)
  • Figure 2: Proposed framework for remote sensing wildfire susceptibility using RF and SHAP.
  • Figure 3: Confusion matrix of the RF classifier for forest and grassland datasets.
  • Figure 4: ROC curves with bootstrapped confidence intervals for the RF classifier applied to forest and grassland datasets.
  • Figure 5: PR curves with confidence intervals for the RF classifier applied to forest and grassland datasets.
  • ...and 12 more figures