Hybrid Physics-ML Framework for Pan-Arctic Permafrost Infrastructure Risk at Record 2.9-Million Observation Scale
Boris Kriuk
TL;DR
This study quantifies permafrost infrastructure risk under climate change across Arctic Russia using a hybrid physics–machine learning framework. It combines a pan-Arctic dataset of 2,917,285 observations from 171,605 locations (2005–2021) with a stacked ensemble of Random Forest, Histogram Gradient Boosting, and Elastic Net, complemented by physics-based adjustments that address extrapolation beyond historical conditions. The approach achieves high predictive accuracy (e.g., $R^2>0.97$ for tree-based models) and provides spatially explicit uncertainty maps and quantile-based risk classifications under three RCP scenarios, revealing substantial degradation under high-emission futures (mean declines of up to 20.27 percentage points under RCP8.5). By delivering an open-source, scalable forecasting system with rigorous spatiotemporal validation and uncertainty quantification, the framework supports engineering design codes, climate adaptation planning, and decision-making for Arctic infrastructure, while remaining generalizable to other permafrost regions.
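As a rough illustration of what a quantile-based risk classification can look like, the sketch below ranks locations by projected permafrost decline and labels the largest declines as high risk. The function name `classify_risk`, the rank-based rule, and the default shares (chosen to match the 15% high-risk and 25% medium-risk shares reported in the abstract) are assumptions for illustration, not the paper's implementation.

```python
def classify_risk(declines_pp, high_share=0.15, medium_share=0.25):
    """Label each location 'high', 'medium', or 'low' by rank of projected decline.

    declines_pp: projected permafrost fraction loss per location, in
                 percentage points (larger value = larger loss).
    high_share / medium_share: hypothetical fractions of locations assigned
                 to each class; the rest are labeled 'low'.
    """
    n = len(declines_pp)
    # Indices sorted from largest to smallest projected decline.
    order = sorted(range(n), key=lambda i: declines_pp[i], reverse=True)
    n_high = round(n * high_share)
    n_medium = round(n * medium_share)

    labels = ["low"] * n
    for rank, idx in enumerate(order):
        if rank < n_high:
            labels[idx] = "high"
        elif rank < n_high + n_medium:
            labels[idx] = "medium"
    return labels


# Toy input: 20 locations with declines of 0, 1, ..., 19 pp.
example_labels = classify_risk([float(v) for v in range(20)])
```

Because the classes are defined by rank rather than by fixed thresholds, the class shares stay constant across scenarios; a fixed-threshold variant would instead let the high-risk share grow as projected declines increase.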
Abstract
Arctic warming threatens over 100 billion in permafrost-dependent infrastructure across northern territories, yet existing risk assessment frameworks lack spatiotemporal validation, uncertainty quantification, and operational decision-support capabilities. We present a hybrid physics-machine learning framework that integrates 2.9 million observations from 171,605 locations (2005–2021), combining permafrost fraction data with climate reanalysis. Our stacked ensemble model (Random Forest + Histogram Gradient Boosting + Elastic Net) achieves $R^2=0.980$ (RMSE = 5.01 percentage points, pp) under rigorous spatiotemporal cross-validation that prevents data leakage. To address the limitations of machine learning in extrapolative climate scenarios, we develop a hybrid approach that combines learned climate-permafrost relationships (60% weight) with a physical permafrost sensitivity model (40% weight, sensitivity of -10 pp/°C). Under RCP8.5 forcing (+5 °C over 10 years), we project a mean permafrost fraction change of -20.3 pp (median: -20.0 pp), with 51.5% of Arctic Russia losing more than 20 pp. Infrastructure risk classification identifies 15% of zones as high-risk and 25% as medium-risk, with spatially explicit uncertainty maps. Our framework represents the largest validated permafrost ML dataset globally, provides the first operational hybrid physics-ML forecasting system for Arctic infrastructure, and delivers open-source tools that enable probabilistic permafrost projections for engineering design codes and climate adaptation planning. The methodology is generalizable to other permafrost regions and demonstrates how hybrid approaches can overcome the limitations of purely data-driven methods in climate change applications.
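The hybrid weighting described above can be sketched as a simple weighted sum of two projected changes. This is a minimal sketch under stated assumptions: the function name `hybrid_projection`, the linear blend, and the clipping of the result to the 0–100% range are illustrative choices, while the 60/40 weights and the -10 pp/°C sensitivity are taken from the abstract. The paper's actual combination rule may differ.

```python
def hybrid_projection(baseline_pp, ml_delta_pp, warming_c,
                      ml_weight=0.6, physics_weight=0.4,
                      sensitivity_pp_per_c=-10.0):
    """Blend an ML-projected change with a linear physical sensitivity term.

    baseline_pp: current permafrost fraction, in percent (0 to 100).
    ml_delta_pp: change projected by the learned model, in percentage points.
    warming_c:   scenario warming, in degrees Celsius.
    Returns (projected_fraction_pp, blended_delta_pp).
    """
    # Physics term: assumed linear response to warming (-10 pp per °C).
    physics_delta_pp = sensitivity_pp_per_c * warming_c
    # Weighted combination of the learned and physical components.
    blended_delta_pp = (ml_weight * ml_delta_pp
                        + physics_weight * physics_delta_pp)
    # Assumption: keep the projected fraction within physical bounds.
    projected_pp = min(100.0, max(0.0, baseline_pp + blended_delta_pp))
    return projected_pp, blended_delta_pp


# Hypothetical inputs: 80% baseline, ML projects -2 pp, scenario is +5 °C.
projected, delta = hybrid_projection(80.0, -2.0, 5.0)
```

With these hypothetical inputs the physics term contributes 0.4 × (-50 pp) = -20 pp and the learned term 0.6 × (-2 pp) = -1.2 pp. This shows how the physical component can dominate the blend when the learned model, trained only on historical conditions, predicts little change outside its training range.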
