Variability in Performance of a Machine-Learning Seismicity Catalog: Central Italy, 2016-2017

Jaehong Chung; Yifan Yu; Lauro Chiaraluce; Maddalena Michele; Gregory C. Beroza

TL;DR

The study examines how a machine-learning (ML) seismicity catalog changes detection performance station by station across a network, not just in aggregate. It introduces a probability-based magnitude-of-completeness (PMC) framework that converts station-level detection probabilities into spatial maps of $M_c(\mathbf{x})$, using logistic regression models for P and S waves and a minimum eight-station criterion to define network detectability. Results show broad reductions in $M_c(\mathbf{x})$ for the ML catalog, but with greater across-station variability, and pronounced gains in densely instrumented regions, especially for S waves, highlighting both the benefits and the limits of ML-based monitoring. The framework provides a practical tool for evaluating catalog quality, guiding network design, and informing seismic-risk assessments with spatially resolved detectability metrics.

Abstract

Machine learning (ML) catalogs contain many more earthquakes than routine catalogs, but their performance in phase picking and earthquake detection has not been fully evaluated. We develop station-level detection probabilities using logistic regression and combine them across a seismic network to compute spatial magnitude-of-completeness fields. We apply this approach to two catalogs from the 2016-2017 Central Italy sequence that were constructed from the same seismic network, one routine and one ML based. At the station level, the ML picker increases detection sensitivity by identifying smaller magnitude events and detecting earthquakes at greater distances. Spatially, the magnitude-of-completeness decreases substantially, with median values shifting from 1.6 to 0.5 for P waves and from 1.7 to 0.5 for S waves. However, the ML catalog also shows greater variability in station-level performance than the routine catalog. These results demonstrate that ML-based improvements in detectability are widespread but spatially non-uniform, highlighting their benefits, their limitations, and the potential for further improvements.
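
The station-level step described above can be illustrated with a minimal sketch. It assumes a logistic model in magnitude and log10 distance, P(detect | M, r) = sigmoid(b0 + b1 M + b2 log10 r); this functional form, the coefficient values, the synthetic data, and the function names are all illustrative assumptions and are not taken from the paper. The fit uses plain gradient ascent on the log-likelihood so that the example needs only the standard library.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_station_logistic(mags, dists_km, detected, lr=0.5, n_iter=2000):
    """Fit b = (b0, b1, b2) by gradient ascent on the mean log-likelihood.

    Assumed model (not the paper's stated form):
    P(detect) = sigmoid(b0 + b1 * M + b2 * log10(r_km)).
    """
    feats = [(1.0, m, math.log10(r)) for m, r in zip(mags, dists_km)]
    n = len(feats)
    b = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for x, y in zip(feats, detected):
            p = sigmoid(b[0] * x[0] + b[1] * x[1] + b[2] * x[2])
            for j in range(3):
                grad[j] += (y - p) * x[j]
        b = [bj + lr * gj / n for bj, gj in zip(b, grad)]
    return b

def detection_probability(b, mag, dist_km):
    # Probability that this station picks an event of magnitude `mag` at `dist_km`.
    return sigmoid(b[0] + b[1] * mag + b[2] * math.log10(dist_km))

# Synthetic detected / non-detected picks for one hypothetical station.
random.seed(0)
true_b = (2.0, 3.0, -4.0)  # hypothetical coefficients, for illustration only
mags = [random.uniform(0.0, 3.0) for _ in range(300)]
dists = [10 ** random.uniform(0.5, 2.2) for _ in range(300)]  # about 3 to 160 km
detected = [
    1 if random.random() < sigmoid(true_b[0] + true_b[1] * m + true_b[2] * math.log10(r)) else 0
    for m, r in zip(mags, dists)
]

b = fit_station_logistic(mags, dists, detected)
p_near = detection_probability(b, 1.5, 20.0)
p_far = detection_probability(b, 1.5, 100.0)
print("fitted coefficients:", [round(v, 2) for v in b])
print("P(detect | M=1.5) at 20 km and 100 km:", round(p_near, 3), round(p_far, 3))
```

In this sketch a more sensitive picker would show up as a larger intercept or a weaker distance penalty, which is one way to read the station-level differences between the routine and ML catalogs.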

Paper Structure (10 sections, 3 equations, 7 figures)

Introduction
Data: routine vs. machine learning catalogs for the 2016-2017 Central Italy sequence
Methods
Station-level detection probability
Spatial detection probability and magnitude of completeness
Results and Discussion
Station-level Detectability Differences
Spatial detection evolution
Potential and Limitations of the ML Catalog and the PMC Framework
Conclusions

Figures (7)

Figure 1: Earthquake catalog data: (a) Map of the study area showing earthquakes from the routine catalog (olive) together with stations of the INGV network (triangles). (b) Same region, plotted for events in the ML catalog (blue). (c) Magnitude–frequency distributions for both catalogs, shown as histograms (filled) and cumulative curves for local magnitude ($M_L$). A bin width of 0.1 magnitude units is used for both datasets. Previously reported global completeness magnitudes are $M_c = 1.6$ for the routine catalog and $M_c = 0.2$ for the ML-based catalog (Chiaraluce et al., 2022).
Figure 2: Workflow for quantifying spatial, probability-based magnitude of completeness: (a) Station-level detected and non-detected P-wave picks for an example station, FEMA, in the IV network, shown in both map view and magnitude–distance space. (b) Station-level detection probability calculated from the observations in (a). (c) Spatial detection probability for selected magnitudes; examples shown for $M=1.1$ and $M=1.3$. (d) Resulting spatial magnitude of completeness map ($M_c(\mathbf{x})$).
Figure 3: Derived detection probability for station GIGS in the IV network, with detected (black dots) and non-detected (gray crosses) events. The left column shows the routine catalog and the right column the ML-based catalog; the top row is for P waves and the bottom row for S waves. Additional examples from other stations are provided in Figures S2–S5.
Figure 4: Station-level detection probability as a function of magnitude at a fixed distance of 50 km. The left column shows the routine catalog and the right column the ML-based catalog; the top row shows P waves and the bottom row S waves. Thick curves show the median detection probability for each catalog and phase, and thin lines show individual-station curves.
Figure 5: Station-level detection probability as a function of distance at a fixed magnitude of $M_L = 1.5$. The left column shows the routine catalog and the right column the ML-based catalog; the top row shows P waves and the bottom row S waves. Thick curves show the median detection probability for each catalog and phase, and thin lines show individual-station curves.
...and 2 more figures
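
The network-level steps in the Figure 2 workflow, panels (c) and (d), can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: stations are treated as independent, so the probability that at least eight stations detect an event follows a Poisson-binomial distribution; each station uses an assumed logistic model in magnitude and log10 epicentral distance; and $M_c(\mathbf{x})$ is taken as the smallest magnitude whose network detection probability reaches a hypothetical target `p_target`. The station layout, the coefficients, the planar kilometre coordinates, and all function names are made up for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def prob_at_least_k(probs, k):
    """P(at least k detections) for independent stations (Poisson-binomial)."""
    # dist[j] = probability of exactly j detections among the stations processed so far.
    dist = [1.0] + [0.0] * len(probs)
    for p in probs:
        for j in range(len(probs), 0, -1):
            dist[j] = dist[j] * (1.0 - p) + dist[j - 1] * p
        dist[0] *= 1.0 - p
    return sum(dist[k:])

def mc_at_point(x_km, y_km, stations, k_min=8, p_target=0.999, mags=None):
    """Smallest trial magnitude detectable by >= k_min stations with P >= p_target.

    `stations` holds (x_km, y_km, (b0, b1, b2)) tuples; p_target is an assumed value.
    Returns None if no trial magnitude reaches the target.
    """
    if mags is None:
        mags = [round(0.1 * i, 1) for i in range(-10, 41)]  # trial magnitudes -1.0 to 4.0
    for m in mags:
        probs = []
        for sx, sy, (b0, b1, b2) in stations:
            r = max(math.hypot(x_km - sx, y_km - sy), 1.0)  # floor avoids log10(0)
            probs.append(sigmoid(b0 + b1 * m + b2 * math.log10(r)))
        if prob_at_least_k(probs, k_min) >= p_target:
            return m
    return None

# Hypothetical network: 12 identical stations on a ring of radius 30 km.
coeffs = (2.0, 3.0, -4.0)  # illustrative coefficients, not fitted values
stations = [
    (30.0 * math.cos(2 * math.pi * i / 12), 30.0 * math.sin(2 * math.pi * i / 12), coeffs)
    for i in range(12)
]

mc_center = mc_at_point(0.0, 0.0, stations)   # inside the network
mc_far = mc_at_point(150.0, 0.0, stations)    # well outside the network
print("Mc at network center:", mc_center)
print("Mc 150 km from center:", mc_far)
```

Evaluating `mc_at_point` over a grid of nodes would produce a map analogous to panel (d); the independence assumption and the choice of `p_target` both influence the resulting values.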
