Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling

Pu Ren, Rie Nakata, Maxime Lacour, Ilan Naiman, Nori Nakata, Jialin Song, Zhengfa Bi, Osman Asif Malik, Dmitriy Morozov, Omri Azencot, N. Benjamin Erichson, Michael W. Mahoney

Abstract

Predicting high-fidelity ground motions for future earthquakes is crucial for seismic hazard assessment and infrastructure resilience. Conventional empirical simulations suffer from sparse sensor distribution and geographically localized earthquake locations, while physics-based methods are computationally intensive and require accurate representations of Earth structures and earthquake sources. We propose a novel artificial intelligence (AI) simulator, Conditional Generative Modeling for Ground Motion (CGM-GM), to synthesize high-frequency and spatially continuous earthquake ground motion waveforms. CGM-GM takes earthquake magnitudes and the geographic coordinates of earthquakes and sensors as inputs, learning complex wave physics and Earth heterogeneities without explicit physics constraints. This is achieved through a probabilistic autoencoder that captures latent distributions in the time-frequency domain and variational sequential models for prior and posterior distributions. We evaluate the performance of CGM-GM using small-magnitude earthquake records from the San Francisco Bay Area, a region with high seismic risk. CGM-GM demonstrates strong potential for outperforming a state-of-the-art non-ergodic empirical ground motion model and shows great promise in seismology and beyond.
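
The time-frequency representation mentioned in the abstract can be illustrated with a short sketch. It splits a waveform into an amplitude spectrogram and a phase via an STFT, then reconstructs the waveform from the amplitude together with the true phase. This is not the authors' implementation: the helper names `stft` and `istft`, the Hann window, the window and hop lengths, the sampling rate, and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np

def stft(x, win, hop):
    """Complex spectrogram with shape (frequency bins, time frames)."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    frames = np.array([x[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1).T

def istft(Z, win, hop, length):
    """Weighted overlap-add inverse; also returns a mask of recoverable samples."""
    n = len(win)
    frames = np.fft.irfft(Z.T, n=n, axis=1)
    y = np.zeros(length)
    norm = np.zeros(length)
    for k, frame in enumerate(frames):
        s = k * hop
        y[s:s + n] += frame * win
        norm[s:s + n] += win ** 2
    ok = norm > 1e-3          # samples near the edges are not recoverable
    y[ok] /= norm[ok]
    return y, ok

fs = 100.0                    # assumed sampling rate in Hz (hypothetical)
t = np.arange(2000) / fs
rng = np.random.default_rng(0)
# Synthetic stand-in for a ground motion record: a decaying noise burst.
x = np.exp(-0.3 * t) * rng.standard_normal(t.size)

win = np.hanning(128)         # assumed window and hop (hypothetical choices)
Z = stft(x, win, hop=32)
A = np.abs(Z)                 # amplitude spectrogram A(t, omega)
phase = np.angle(Z)           # "true" phase of the record

# Reconstruct the waveform from the amplitude and the true phase.
x_rec, ok = istft(A * np.exp(1j * phase), win, hop=32, length=x.size)
max_err = np.max(np.abs(x[ok] - x_rec[ok]))
```

With the true phase, the reconstruction matches the input to numerical precision on the samples covered by the analysis windows. At generation time no true phase is available, so a phase retrieval step would take the place of `phase`.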

Paper Structure (8 sections, 20 equations, 14 figures, 1 table)

Figures (14)

  • Figure 1: Overview of the proposed CGM-GM for ground motion synthesis. (a) The entire CGM-GM pipeline: an STFT extracts time-frequency features, and a dynamic VAE learns the amplitude information. The true phase is used for waveform reconstruction during training, and phase retrieval methods are used at the generation stage. The ground motion sequence $\mathbf{x}(t)$ is in the time domain (with $T$ time steps), and the amplitude spectrogram $\mathbf{A}(t,\omega)$ is in the time-frequency domain (with a time resolution of $\tau$). (b) The network architecture of the dynamic VAE. $\mathbf{v}$ denotes the concatenation of the conditional variables, which are embedded into the VAE model. (c) The design of the sequential prior distribution, where RNNs incorporate dynamics into the model prior. (d) The embedding module for the conditional variables, built from MLP layers (green blocks) and ReLU activation functions (red blocks). $\mathbf{e}_v$ denotes the latent feature of the conditional variables. (e) Illustrative waveform comparison between the generations (blue) and the corresponding ground truth (red) for the H1 component, under different combinations of earthquake magnitude $M$, rupture distance $R_{hyp}$, earthquake depth $D_{hyp}$, and epicenter-station azimuth $A_{hyp}$. For each scenario, two waveforms are randomly generated from the same conditional variables. More examples of generated waveforms, including those showing moderate performance and the generations for the H2 component, can be found in the Supplemental Material.
  • Figure 2: Illustrative examples of generated FAS maps. (a)-(c) FAS maps of the non-ergodic GMM, CGM-baseline, and our CGM-GM at a frequency of 10 Hz. The red star and blue triangle denote the earthquake source and the observation station, respectively. The seismic event has a magnitude of 3.84 and a depth of 7.94 km, with its epicenter at latitude $37^{\circ}51.6'N$ and longitude $122^{\circ}15.6'W$. A specific spatial region in the SFBA is selected for evaluation. More generated FAS maps under various earthquake scenarios are provided in the Supplementary Material. (d) and (e) The FAS residual between the ground truth and the simulated samples from our generative model and from the baseline models (non-ergodic GMM and CGM-baseline), for all earthquake recordings over frequencies between 2 and 15 Hz. The discrepancy is computed as the logarithmic residual of the FAS values. The solid lines and shaded areas denote the mean curves and the uncertainty regions of mean $\pm$ std.
  • Figure 3: Comparison of amplitude spectra between the ground truth and the generated samples. (a) FAS results from true seismic recordings and from the generated waveforms across diverse conditional variables. The earthquake depths are within a fixed range in all plots. FAS results are shown at the 15th percentile, mean, and 85th percentile. "Gen." denotes the results from generations. (b) Amplitude spectra heatmaps from ground truth and generated data. The error heatmap is the logarithm of the ratio between the ground truth and the generations.
  • Figure 4: Statistical evaluation of the generated waveforms. (a) PGV distribution versus rupture distance. The solid lines denote the mean curves, and the dashed lines show the boundaries of mean $\pm$ std. (b) Scatter plot of arrival time versus distance for generated and true waveforms. (c) and (d) Cross-validation of the arrival times of P and S waves, respectively, in the EW direction, including scatter plots colored by density and reference fit lines. The density is normalized between 0 and 1.
  • Figure 5: Overview of the earthquake dataset for the H1 component in the SFBA. (a) Distribution of earthquake depths. (b) Magnitude-distance distribution of the dataset; each dot corresponds to one source-receiver pair. (c) and (d) Spatial distributions of observation stations and earthquake sources, respectively.
  • ...and 9 more figures
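
The logarithmic FAS residual described in the captions of Figures 2 and 3 can be sketched as follows. The function names `fas` and `log_fas_residual`, the natural logarithm, the `dt` scaling of the spectrum, the small `eps` guard, and the synthetic records are assumptions for illustration; only the 2 to 15 Hz band and the mean $\pm$ std summary come from the captions.

```python
import numpy as np

def fas(x, dt):
    """Fourier amplitude spectrum of real records sampled every dt seconds."""
    freqs = np.fft.rfftfreq(x.shape[-1], d=dt)
    amp = np.abs(np.fft.rfft(x, axis=-1)) * dt
    return freqs, amp

def log_fas_residual(obs, gen, dt, fmin=2.0, fmax=15.0, eps=1e-12):
    """ln(FAS_obs) - ln(FAS_gen), restricted to the band [fmin, fmax] Hz."""
    freqs, a_obs = fas(obs, dt)
    _, a_gen = fas(gen, dt)
    band = (freqs >= fmin) & (freqs <= fmax)
    res = np.log(a_obs[..., band] + eps) - np.log(a_gen[..., band] + eps)
    return freqs[band], res

# Synthetic stand-ins: 8 "observed" records and "generated" records that are
# exactly twice as large, so every residual should equal -ln(2).
rng = np.random.default_rng(0)
obs = rng.standard_normal((8, 2000))
gen = 2.0 * obs

freqs, res = log_fas_residual(obs, gen, dt=0.01)
mean_curve = res.mean(axis=0)   # solid line in a residual plot
std_curve = res.std(axis=0)     # half-width of the mean +/- std band
```

A residual near zero at a given frequency means the generated and recorded amplitudes agree there on average. A positive value under this sign convention means the generated records underpredict the observed amplitude.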