ST-GRIT: Spatio-Temporal Graph Transformer For Internal Ice Layer Thickness Prediction
Zesheng Liu, Maryam Rahnemoonfar
TL;DR
This paper addresses the prediction of deeper internal ice layer thickness from radargrams, a task that matters for climate modeling but is complicated by noise and varying layer counts. It introduces ST-GRIT, a spatio-temporal graph transformer that combines GraphSAGE-based spatial embeddings with separate spatial and temporal attention blocks to learn from sequences of ice-layer graphs. On Greenland IceBridge radargrams, the approach achieves lower RMSE than state-of-the-art baselines, indicating improved handling of noise and long-range temporal dependencies. Because the method adapts to differing layer counts and radargram sizes, it can make ice-thickness estimates more reliable for climate research and supports scalable spatio-temporal learning on geophysical radar data.
Abstract
Understanding the thickness and variability of internal ice layers in radar imagery is crucial for monitoring snow accumulation, assessing ice dynamics, and reducing uncertainties in climate models. Radar sensors can penetrate ice and provide detailed radargram images of these internal layers. In this work, we present ST-GRIT, a spatio-temporal graph transformer for ice layer thickness prediction, designed to process these radargrams and capture the spatio-temporal relationships between shallow and deep ice layers. ST-GRIT uses an inductive geometric graph learning framework to extract local spatial features as feature embeddings, then applies a series of separate temporal and spatial attention blocks to model long-range dependencies in each dimension. Experimental evaluation on radargram data from the Greenland ice sheet shows that ST-GRIT consistently outperforms current state-of-the-art methods and other baseline graph neural networks, achieving lower root mean squared error (RMSE). These results highlight the advantages of self-attention mechanisms on graphs over pure graph neural networks, including the ability to handle noise, avoid oversmoothing, and capture long-range dependencies. Moreover, keeping the spatial and temporal attention blocks separate lets the model learn spatial relationships and temporal patterns distinctly and robustly.
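The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a GraphSAGE-style mean aggregator for the spatial embeddings, single-head attention without learned query, key, or value projections, a chain-shaped graph, and random weights. All function names, shapes, and the adjacency layout are hypothetical and chosen only to show how temporal and spatial attention can be applied as separate steps and how RMSE is computed.

```python
import numpy as np


def sage_mean_layer(x, adj, w_self, w_neigh):
    """GraphSAGE-style mean aggregation (assumed variant).

    x: (N, F) node features, adj: (N, N) binary adjacency,
    w_self / w_neigh: (F, D) weight matrices.
    """
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)  # avoid divide-by-zero
    neigh_mean = (adj @ x) / deg                           # mean over neighbours
    return np.maximum(x @ w_self + neigh_mean @ w_neigh, 0.0)  # ReLU


def self_attention(h):
    """Scaled dot-product self-attention over the second-to-last axis.

    h: (..., L, D). Queries, keys, and values are all h (no learned
    projections), which is a simplification for illustration.
    """
    d = h.shape[-1]
    scores = h @ np.swapaxes(h, -1, -2) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ h


def st_block(h):
    """Separate temporal then spatial attention with residual connections.

    h: (T, N, D) embeddings for T graphs in the sequence, N nodes each.
    """
    # Temporal attention: each node attends across the T graphs.
    h_nodes_first = np.swapaxes(h, 0, 1)                   # (N, T, D)
    h = h + np.swapaxes(self_attention(h_nodes_first), 0, 1)
    # Spatial attention: within each graph, nodes attend to each other.
    h = h + self_attention(h)
    return h


def rmse(pred, target):
    """Root mean squared error between two arrays of equal shape."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


# Toy data: 5 graphs, 8 nodes each, 3 input features, 4 embedding dims.
rng = np.random.default_rng(0)
T, N, F, D = 5, 8, 3, 4
x = rng.normal(size=(T, N, F))
adj = np.eye(N, k=1) + np.eye(N, k=-1)                     # hypothetical chain graph
w_self = rng.normal(size=(F, D))
w_neigh = rng.normal(size=(F, D))

emb = np.stack([sage_mean_layer(x[t], adj, w_self, w_neigh) for t in range(T)])
out = st_block(emb)
print(out.shape)
```

Applying attention along one axis at a time keeps each attention matrix small (T by T per node, N by N per graph) instead of forming a single (T·N) by (T·N) matrix over all node-time pairs, which is one common motivation for factorized spatio-temporal attention.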
