EarthquakeNPP: A Benchmark for Earthquake Forecasting with Neural Point Processes

Samuel Stockman, Daniel Lawson, Maximilian Werner

TL;DR

This work introduces EarthquakeNPP, a benchmarking platform that curates and standardizes existing public resources (globally available earthquake catalogs, the ETAS model, and evaluation protocols from the seismology community) to foster future collaboration between the seismology and machine learning communities.

Abstract

For decades, classical point process models, such as the epidemic-type aftershock sequence (ETAS) model, have been widely used for forecasting the event times and locations of earthquakes. Recent advances have led to Neural Point Processes (NPPs), which promise greater flexibility and improvements over such classical models. However, the currently used benchmark for NPPs does not represent an up-to-date challenge in the seismological community, since it contains data leakage and omits the largest earthquake sequence from the region. Additionally, initial earthquake forecasting benchmarks fail to compare NPPs with state-of-the-art forecasting models commonly used in seismology. To address these gaps, we introduce EarthquakeNPP, a benchmarking platform that curates and standardizes existing public resources: globally available earthquake catalogs, the ETAS model, and evaluation protocols from the seismology community. The datasets cover a range of small to large target regions within California, dating from 1971 to 2021, and include different methodologies for dataset generation. Benchmarking experiments, using both log-likelihood and generative evaluation metrics widely recognized in seismology, show that none of the five NPPs tested outperforms ETAS. These findings suggest that current NPP implementations are not yet suitable for practical earthquake forecasting. Nonetheless, EarthquakeNPP provides a platform to foster future collaboration between the seismology and machine learning communities.
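
To make the log-likelihood evaluation mentioned in the abstract concrete, here is a minimal sketch of the log-likelihood of a purely temporal, ETAS-style point process. It is an illustration under stated assumptions, not the EarthquakeNPP implementation: the function name `etas_temporal_loglik`, the parameter names (`mu`, `k`, `a`, `c`, `p`, `m_c`), and the specific parameterization (exponential productivity with an unnormalized Omori-Utsu kernel, `p != 1`) are all assumptions chosen for brevity, and the spatial component is omitted.

```python
import numpy as np

def etas_temporal_loglik(times, mags, T, mu, k, a, c, p, m_c):
    """Log-likelihood of a temporal ETAS-style model on [0, T].

    Assumed (hypothetical) parameterization:
        lambda(t) = mu + sum_{t_j < t} k * exp(a * (m_j - m_c)) * (t - t_j + c) ** (-p)
    Requires p != 1 and T >= max(times).
    """
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)

    # Productivity of each event: larger magnitudes trigger more aftershocks.
    prod = k * np.exp(a * (mags - m_c))

    # Pairwise lags dt[i, j] = t_i - t_j; only strictly earlier events contribute.
    dt = times[:, None] - times[None, :]
    past = dt > 0
    kernel = np.where(past, (np.where(past, dt, 1.0) + c) ** (-p), 0.0)

    # Conditional intensity evaluated at each observed event time.
    intensity = mu + kernel @ prod

    # Compensator: closed-form integral of the intensity over [0, T].
    tail = (c ** (1.0 - p) - (T - times + c) ** (1.0 - p)) / (p - 1.0)
    compensator = mu * T + np.sum(prod * tail)

    return np.sum(np.log(intensity)) - compensator
```

With `k = 0` the triggering term vanishes and the expression reduces to the homogeneous Poisson log-likelihood, `n * log(mu) - mu * T`, which is a convenient sanity check. The pairwise lag matrix costs O(n²) memory, so this form only suits small catalogs.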

Paper Structure

This paper contains 43 sections, 19 equations, 30 figures, 6 tables.

Figures (30)

  • Figure 1: Earthquakes contained in the observational datasets found in EarthquakeNPP. Colours indicate the respective datasets, including the target region, magnitude of completeness $M_c$, number of events and the time period that the dataset spans. In red is a fault map from the GEM Global Active Faults Database [styron2020gem].
  • Figure 2: Test temporal log-likelihood scores for all the spatio-temporal point process models on each of the EarthquakeNPP datasets. SCEDC_20, SCEDC_25 and SCEDC_30 correspond to magnitude thresholds (Mw 2.0, 2.5, 3.0) of the SCEDC dataset. Error bars for the NPPs show the mean and standard deviation over three repeat runs.
  • Figure 3: Test spatial log-likelihood scores for all the spatio-temporal point process models on each of the EarthquakeNPP datasets. SCEDC_20, SCEDC_25 and SCEDC_30 correspond to magnitude thresholds (Mw 2.0, 2.5, 3.0) of the SCEDC dataset. Error bars for the NPPs show the mean and standard deviation over three repeat runs.
  • Figure 4: Forecasts from ETAS, SMASH, and DSTPP during the 2010 M7.2 El Mayor-Cucapah earthquake contained in the ComCat dataset. Top: Spatial forecasts for the day following the mainshock. ETAS accurately captures the primary aftershock zone along the Laguna Salada fault system. SMASH produces smoother forecasts with broader spatial spread, while DSTPP concentrates its probability mass north of the mainshock epicenter. Bottom: Cumulative earthquake counts over time, with magnitudes shown as scaled orange circles. Forecast number distributions from each model are plotted with 95% confidence intervals. All models initially underestimate aftershock activity. ETAS and SMASH begin to recover after the first week, whereas DSTPP continues to systematically underpredict event counts throughout the sequence.
  • Figure 5: Generating an earthquake catalog involves several key steps: seismic phase picking, magnitude estimation, and the association and location of seismic sources. This process transforms raw waveform data recorded at seismic stations to locations, times, and magnitudes of earthquakes.
  • ...and 25 more figures