Exploring Challenges in Deep Learning of Single-Station Ground Motion Records
Ümit Mert Çağlar, Baris Yilmaz, Melek Türkmen, Erdem Akagündüz, Salih Tileylioglu
TL;DR
This work interrogates whether deep learning on single-station ground motion records truly learns waveform features or instead leverages auxiliary P/S phase information to predict epicentral distance. Using the STEAD dataset, it benchmarks ResNet and Temporal Convolutional Network (TCN) encoders, performing an ablation by including or excluding a P/S channel on both local and global subsets. The results show a strong dependence on P/S phase information, with Pearson $r=0.956$ and Spearman $\rho=0.926$ correlations between P/S differences and distance, and substantial performance gains when the P/S channel is included (e.g., TCN local 1.74 km vs 7.00 km without P/S). The study demonstrates that current DL approaches to single-station seismic data may overfit to auxiliary cues, underscoring the need for robust designs and clearer baselines for waveform-based epicentral distance estimation in seismology.
Abstract
Contemporary deep learning models have demonstrated promising results across various applications within seismology and earthquake engineering. These models rely primarily on ground motion records for tasks such as earthquake event classification, localization, earthquake early warning, and structural health monitoring. However, the extent to which these models truly extract meaningful patterns from these complex time-series signals remains underexplored. In this study, our objective is to evaluate the degree to which auxiliary information, such as seismic phase arrival times or seismic station distribution within a network, dominates the process of deep learning from ground motion records, potentially hindering the effectiveness of that learning. Our experimental results reveal a strong dependence on the highly correlated Primary (P) and Secondary (S) phase arrival times. These findings expose a critical gap in the current research landscape, highlighting the lack of robust methodologies for deep learning from single-station ground motion recordings that do not rely on auxiliary inputs.
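To illustrate why P and S arrival times can act as such a strong shortcut, the following minimal sketch generates synthetic arrival times and measures how well the S-P time difference tracks distance. It is not the authors' pipeline and does not use STEAD. The constant velocities (6.0 km/s for P, 3.5 km/s for S), the homogeneous medium, the neglect of source depth, the distance range, the picking noise level, and all function names are assumptions made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sp_distance_km(t_p, t_s, v_p=6.0, v_s=3.5):
    """Distance from the S-P time, assuming constant velocities (km/s).

    Solves t_s - t_p = d / v_s - d / v_p for d. The default velocities
    are assumed crustal averages, not values taken from the paper.
    """
    return (t_s - t_p) * v_p * v_s / (v_p - v_s)


def demo(n=2000, noise_s=0.3, seed=0, v_p=6.0, v_s=3.5):
    """Correlate noisy synthetic S-P differences with true distance."""
    rng = random.Random(seed)
    dist = [rng.uniform(5.0, 110.0) for _ in range(n)]   # assumed range, km
    t_p = [d / v_p + rng.gauss(0.0, noise_s) for d in dist]
    t_s = [d / v_s + rng.gauss(0.0, noise_s) for d in dist]
    diff = [s - p for p, s in zip(t_p, t_s)]
    return pearson(diff, dist), spearman(diff, dist)


if __name__ == "__main__":
    r, rho = demo()
    print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Under these simplified assumptions, distance is close to a linear function of the S-P difference, so a model that receives P/S information as input has a direct route to the target without needing to learn anything from the waveform itself. Real records depart from this idealization through source depth, velocity heterogeneity, and picking errors.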
