Detecting Spatiotemporal b-Value Anomalies with a Progressive Deep Learning Architecture

Jonas Köhler; Wei Li; Johannes Faber; Georg Rümpker; Nishtha Srivastava

TL;DR

This paper develops a methodological framework to detect spatiotemporal anomalies in evolving daily $b$-value fields over Japan by constructing fine‑scale $b$-value maps and framing anomaly detection as a binary classification on 512 × 32 × 32 blocks. A hybrid CNN–TCN architecture processes spatial patterns and temporal dynamics, while a progressive time‑forward training scheme mitigates nonstationarity and prevents future information leakage. Internal validation across a time-forward catalog demonstrates how data density, sample balance, and large aftershock sequences (notably the 2011 Tōhoku event) shape model behavior and performance, with MAE-based configurations generally performing best. The work emphasizes careful interpretation of anomaly scores, attributes model behavior to catalog characteristics, and outlines a path toward adapting the approach for rate-like forecasts and broader seismic regimes. Overall, it provides a structured, reproducible framework for exploring how spatiotemporal $b$-value evolution relates to large earthquakes while highlighting methodological considerations and limitations.
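The time-forward idea can be illustrated with a small scheduling sketch. It assumes an expanding training window that always ends where the validation window begins, so no sample from the future reaches the model. The function names, the start dates, and the expanding-window choice are illustrative assumptions, not details taken from the paper; only the 14-day meta epoch length appears in the source.

```python
from datetime import date, timedelta

def meta_epochs(first_val_day, end_day, length_days):
    """Yield consecutive (val_start, val_end) windows moving forward in time."""
    step = timedelta(days=length_days)
    val_start = first_val_day
    while val_start < end_day:
        val_end = min(val_start + step, end_day)  # last window may be shorter
        yield val_start, val_end
        val_start = val_end

def progressive_schedule(catalog_start, first_val_day, end_day, length_days):
    """Pair each validation window with a training window that ends before it.

    Assumption: the training window expands over all past data. Whether the
    authors retrain on all past data or only on the latest meta epoch is not
    stated in this section.
    """
    return [
        ((catalog_start, val_start), (val_start, val_end))
        for val_start, val_end in meta_epochs(first_val_day, end_day, length_days)
    ]

# Hypothetical dates, chosen only to keep the example short.
schedule = progressive_schedule(
    date(1999, 1, 1), date(2010, 1, 1), date(2010, 3, 1), length_days=14
)
for (train_start, train_end), (val_start, val_end) in schedule:
    print(f"train [{train_start}, {train_end})  validate [{val_start}, {val_end})")
```

Because each training interval is half-open and stops at `val_start`, the validation samples of one meta epoch only become training data in the next one.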

Abstract

Identifying systematic patterns in seismicity that precede large earthquakes remains a central challenge in statistical seismology. In this work, we present a methodological framework for detecting spatiotemporal anomalies in seismicity using the evolution of gridded $b$-values. Focusing on the Japanese subduction zone, we construct daily $b$-value fields on a fine spatial grid by aggregating local seismicity over moving time windows, yielding a continuous 2+1D representation of seismic-state evolution. We formulate the problem as a binary classification task in which spatiotemporal blocks extracted from these $b$-value fields are labeled according to the occurrence of a target earthquake with $M_\mathrm{w} \geq 5$ in the central region within the next day. To model this data, we introduce a hybrid deep-learning architecture that combines a spatial convolutional encoder with a temporal convolutional network, enabling joint learning of spatial structure and temporal dynamics. A progressive meta-epoch training scheme is employed, in which the model is iteratively updated using a time-forward strategy that mirrors operational deployment and mitigates issues related to nonstationarity. This paper is strictly methodological in scope. It describes the construction of $b$-value fields, the spatiotemporal sampling strategy, the network architecture, and the progressive training and internal validation framework used for model development and parameter selection.
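For readers unfamiliar with $b$-value estimation, the sketch below shows one common way to compute a single grid-cell value from the magnitudes collected in a time window. It assumes the Aki-Utsu maximum-likelihood estimator with a half-bin correction; the abstract does not state which estimator the authors use, and the parameters `m_c`, `bin_width`, and `min_events` are hypothetical.

```python
import math

def b_value_mle(magnitudes, m_c, bin_width=0.1, min_events=50):
    """Aki-Utsu maximum-likelihood b-value for one cell and one time window.

    Assumption: this estimator stands in for the paper's own procedure.
    Returns None when too few events lie at or above the completeness
    magnitude m_c, so that sparse cells can be masked rather than guessed.
    """
    mags = [m for m in magnitudes if m >= m_c]
    if len(mags) < min_events:
        return None
    # The half-bin shift accounts for magnitudes being reported in bins.
    mean_excess = sum(mags) / len(mags) - (m_c - bin_width / 2)
    if mean_excess <= 0:
        return None
    return math.log10(math.e) / mean_excess

# Synthetic check: a mean excess of log10(e) corresponds to b = 1.
synthetic = [2.0 - 0.05 + math.log10(math.e)] * 60
print(b_value_mle(synthetic, m_c=2.0))
```

Repeating this per grid cell and per day over a moving window would give a field of the kind described above; how the windows and cells are actually defined is left to the paper's own sections.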

Paper Structure (32 sections, 5 equations, 25 figures, 2 tables)

This paper contains 32 sections, 5 equations, 25 figures, 2 tables.

Introduction
Data
$b$-Value Calculation
Labeled Sample Construction
Methods
Problem formulation
Model Architecture
Progressive Meta-Epoch Training Scheme
Parameter Search
Internal Validation Metrics
Results
Internal Validation
Parameter search Results
Discussion
Overall Method Behavior
...and 17 more sections

Figures (25)

Figure 1: Overview of the study area. Shown are all earthquakes between 1999-01-01 and 2019-12-31, colored by depth and scaled by magnitude. The tectonic boundaries (red) are taken from Bird (2003).
Figure 2: Illustration of the model architecture. The model reduces the input data from $512 \times 32 \times 32 \times 1$ dimensions to 1. Since the second and third dimensions of that tensor are always equal, we collapse them to one dimension in the sketch. The 3D convolutional layer transforms its input dimensionality from $t \times j \times j \times k$ to $t\times \frac{j}{2} \times \frac{j}{2} \times 2k$, while the dilation block is a temporally acting 3D convolutional layer whose dilation is set to powers of 2. Each call to the dilation block reduces the first dimension by a factor of 2. The model output is an uncalibrated anomaly score.
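The two shape rules in this caption can be traced with plain arithmetic. The sketch below applies the spatial rule until the grid is $1 \times 1$ and then the temporal rule until one time step remains. The ordering of the layers, their count, and the function names are assumptions made for illustration; only the per-layer shape rules and the input size come from the caption.

```python
def spatial_step(shape):
    """3D convolution rule from the caption: (t, j, j, k) -> (t, j/2, j/2, 2k)."""
    t, j, _, k = shape
    return (t, j // 2, j // 2, 2 * k)

def temporal_step(shape):
    """Dilated temporal block rule from the caption: halves the first dimension."""
    t, j1, j2, k = shape
    return (t // 2, j1, j2, k)

def trace_shapes(shape):
    """List the tensor shape after every step.

    Assumption: all spatial steps run before all temporal steps. The actual
    interleaving in the paper's model may differ.
    """
    shapes = [shape]
    while shapes[-1][1] > 1:
        shapes.append(spatial_step(shapes[-1]))
    while shapes[-1][0] > 1:
        shapes.append(temporal_step(shapes[-1]))
    return shapes

shapes = trace_shapes((512, 32, 32, 1))
for s in shapes:
    print(s)
```

Under these assumptions, five spatial steps and nine temporal steps leave a tensor of shape `(1, 1, 1, 32)`. Mapping those 32 channels to the single anomaly score would need a final readout layer, which is also an assumption here.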
Figure 3: Earthquake distribution in the validation data. The validation data is chunked into meta epochs of different lengths (explained in section \ref{sec:training}). The left column shows how often meta epochs with a certain number of events occur for meta epoch lengths of 14, 30, or 50 days. The very high numbers correspond to the Tōhoku earthquakes and the months that follow. This can be seen in the right column, where the number of events in each meta epoch is shown over the meta epochs.
Figure 4: Availability of samples for selection as an nEQ sample. The upper panel shows the availability of samples over time. For Model4.9 (blue) this is noisy but constant, except for the time after the Tōhoku earthquake, when a large portion of the domain is active and excluded. Model4.0 (red) also has fewer samples from the beginning of the period. The left panel shows the available pool for the model with the highest training-validation performance, which is the overall winner of the parameter search, while the right panel shows the pool available to the best-performing model with $M_\mathrm{lim} = 4.9$. In particular, the region $35^\circ\!-\!40^\circ\,\text{N} \times 140^\circ\!-\!145^\circ\,\text{E}$ (which contains the Tōhoku earthquake) is mostly missing in the former case but readily available in the latter case. An equivalent overview for all parameter configurations is given in Figure \ref{fig:neq_pool_all}.
Figure 5: Classification accuracy for meta epochs. Each marker corresponds to one meta epoch and is colored by the number of samples in the validation set for that meta epoch; the color range is limited to 0 to 50. A few meta epochs have much higher counts, namely those following the Tōhoku earthquake. A closer look at the events in the validation data is provided in Figure \ref{fig:Res_Val_EQoccurrence}. The lines show different running averages of the accuracy: the light blue line is an average over 5 meta epochs, weighted because the meta epochs contain different numbers of samples, and the magenta line is the weighted average over 20 meta epochs. The black line shows the cumulative accuracy, which indicates some progression after the initial meta epochs. The two vertical dotted lines mark the Tōhoku earthquake and the end of the initial training period at the end of 2019. The dotted horizontal line shows the level of the training set accuracy. The solid black line after 2019 corresponds to the cumulative accuracy on the new, unseen data, where, after giving a forecast for the next 14 days, the model is retrained on those 14 days. From 2020 onward, a new cumulative mean is shown.
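The sample-weighted averages described in this caption can be sketched as follows. The code assumes a trailing window (the current meta epoch and the ones before it) and weights each meta epoch's accuracy by its validation sample count. The window alignment, the function names, and the toy numbers are assumptions; the caption only states that the averages are weighted and span 5 or 20 meta epochs.

```python
def weighted_running_accuracy(accuracies, counts, window):
    """Sample-weighted mean accuracy over a trailing window of meta epochs.

    Assumption: the window trails the current meta epoch. Entries are None
    where the window contains no validation samples at all.
    """
    out = []
    for i in range(len(accuracies)):
        lo = max(0, i - window + 1)
        total = sum(counts[lo:i + 1])
        if total == 0:
            out.append(None)
            continue
        hits = sum(a * n for a, n in zip(accuracies[lo:i + 1], counts[lo:i + 1]))
        out.append(hits / total)
    return out

def cumulative_accuracy(accuracies, counts):
    """Weighted accuracy over all meta epochs up to and including each one."""
    return weighted_running_accuracy(accuracies, counts, window=len(accuracies))

# Toy numbers, not results from the paper.
acc = [1.0, 0.5, 0.0]
n = [2, 2, 4]
print(weighted_running_accuracy(acc, n, window=2))
print(cumulative_accuracy(acc, n))
```

Weighting by sample count keeps a meta epoch with very few validation samples from moving the curve as much as a densely populated one, such as those after the Tōhoku earthquake.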
...and 20 more figures
