High-Fidelity Compression of Seismic Velocity Models via SIREN Auto-Decoders

Caiyun Liu, Xiaoxue Luo, Jie Xiong

Abstract

Implicit neural representations (INRs) have emerged as a powerful paradigm for representing continuous signals independently of grid resolution. In this paper, we propose a high-fidelity neural compression framework based on a SIREN (sinusoidal representation network) auto-decoder to represent multi-structural seismic velocity models from the OpenFWI benchmark. Our method compresses each 70×70 velocity map (4,900 points) into a compact 256-dimensional latent vector, a compression ratio of approximately 19:1. We evaluate the framework on 1,000 samples across five geological families: FlatVel, CurveVel, FlatFault, CurveFault, and Style. The reconstructions reach an average PSNR of 32.47 dB and an SSIM of 0.956, indicating high-quality reconstruction. We further showcase two advantages of the implicit representation: (1) smooth latent-space interpolation that generates plausible intermediate velocity structures, and (2) zero-shot super-resolution that reconstructs velocity fields at arbitrary resolutions up to 280×280 without additional training. These results highlight the potential of INR-based auto-decoders for efficient storage, multi-scale analysis, and downstream geophysical applications such as full waveform inversion.
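
The arithmetic behind the headline numbers can be checked with a few lines of Python. This is a minimal sketch, not the authors' evaluation code: it assumes the compression ratio counts only per-sample values (grid points versus latent entries), and the `data_range` argument of the hypothetical `psnr` helper is an assumption, since the abstract does not state how velocities are normalized before the PSNR is computed.

```python
import math

def compression_ratio(height, width, latent_dim):
    """Grid values stored per sample divided by latent values per sample."""
    return (height * width) / latent_dim

def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB for two equal-length sequences.

    data_range is the assumed peak-to-peak span of the signal; the
    normalization used in the paper is not stated in the abstract.
    """
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)

# 70 x 70 = 4,900 grid values against a 256-dimensional latent code.
ratio = compression_ratio(70, 70, 256)
print(f"compression ratio = {ratio:.2f}:1")
```

The ratio evaluates to 4,900 / 256 ≈ 19.14, which matches the reported figure of about 19:1.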

Paper Structure

This paper contains 43 sections, 10 equations, 9 figures, 13 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Comparison of grid-based and implicit neural representations for seismic velocity fields. (a) Ground truth velocity model at $70\times 70$ resolution (velocity range: 1500–4000 m/s). (b) Traditional grid-based representation with $15\times 15$ sampling, showing discretization artifacts at sharp interfaces (highlighted in red). The yellow dashed rectangle indicates a single grid cell with a constant velocity value. (c) SIREN implicit representation reconstructing the full $70\times 70$ field from a 256-dimensional latent code, demonstrating continuous, artifact-free modeling with a PSNR of 37.93 dB. The white contour lines illustrate the continuous nature of the implicit representation.
  • Figure 2: Spectral bias comparison between a standard ReLU MLP and SIREN. (a) The standard ReLU MLP exhibits a strong low-frequency bias: its amplitude drops rapidly above 25 Hz, so it fails to capture high-frequency components. (b) SIREN, with periodic activation functions, maintains a near-uniform response across a wide frequency range, enabling accurate representation of sharp features and discontinuities.
  • Figure 3: Comparison of auto-encoder and auto-decoder architectures. (a) Traditional auto-encoder: input $x$ is encoded into a latent code $z$ by $E_\phi$, then decoded to $\hat{x}$ by $D_\theta$. (b) Auto-decoder (ours): learnable latent codes $\{z_i\}$ are directly optimized with a shared decoder $f_\theta$, eliminating the encoder and using $L_2$ regularization.
  • Figure 4: SIREN auto-decoder framework overview. Input spatial coordinates $(x,z)$ and a learnable latent code $\mathbf{z}_i$ are concatenated and fed into the SIREN decoder $f_\theta$, which consists of multiple layers with periodic activation functions $\sin(\omega_0 \cdot)$. The decoder outputs the reconstructed velocity field $\hat{\mathbf{V}}_i(x,z)$. During training, the decoder parameters and latent codes are jointly optimized using a reconstruction loss and $L_2$ regularization.
  • Figure 5: Comparison of training and inference phases in the auto-decoder framework. (a) Training phase: learnable latent codes $\{\mathbf{z}_i\}$ and shared decoder $f_\theta$ are jointly optimized. Reconstruction loss (MSE) compares output $\hat{\mathbf{V}}_i$ with ground truth $\mathbf{V}_i$. (b) Inference phase: fixed latent codes $\mathbf{z}_i^*$ and trained decoder $f_\theta^*$ are used for efficient reconstruction without further optimization.
  • ...and 4 more figures
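
The decoder data flow described in the Figure 4 caption, together with the interpolation and super-resolution queries mentioned in the abstract, can be sketched in NumPy. This is an illustrative forward pass only, under stated assumptions: the weights are random and untrained, so the outputs are not velocities; the hidden width, depth, $\omega_0 = 30$, the latent scale, and the coordinate normalization to $[-1, 1]$ are all hypothetical choices not given in this listing. Only the 256-dimensional latent code, the concatenated input, the sine activations, and the 70 and 280 grid sizes come from the text.

```python
import numpy as np

OMEGA_0 = 30.0        # assumed frequency scale; not stated in this listing
LATENT_DIM = 256      # from the abstract
HIDDEN = 64           # hypothetical hidden width
N_SINE_LAYERS = 3     # hypothetical number of sine layers

def init_siren(rng, in_dim, hidden, n_layers):
    """Random (untrained) parameters with SIREN-style uniform bounds.

    Biases reuse the weight bound for simplicity (an assumption).
    """
    dims = [in_dim] + [hidden] * n_layers + [1]
    params = []
    for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
        bound = 1.0 / d_in if i == 0 else np.sqrt(6.0 / d_in) / OMEGA_0
        W = rng.uniform(-bound, bound, size=(d_in, d_out))
        b = rng.uniform(-bound, bound, size=d_out)
        params.append((W, b))
    return params

def make_grid(n):
    """n x n coordinates (x, z) normalized to [-1, 1]; shape (n*n, 2)."""
    axis = np.linspace(-1.0, 1.0, n)
    zz, xx = np.meshgrid(axis, axis, indexing="ij")  # rows follow z, columns follow x
    return np.stack([xx.ravel(), zz.ravel()], axis=-1)

def decode(params, coords, latent):
    """Evaluate f_theta on [coords, latent] with sin(omega_0 * .) activations."""
    W0, b0 = params[0]
    # Mathematically the same as concatenating [x, z, latent] for every point and
    # multiplying by W0, without tiling the latent code across all coordinates.
    h = np.sin(OMEGA_0 * (coords @ W0[:2] + latent @ W0[2:] + b0))
    for W, b in params[1:-1]:
        h = np.sin(OMEGA_0 * (h @ W + b))
    W, b = params[-1]
    return (h @ W + b)[:, 0]  # linear output layer, one value per coordinate

rng = np.random.default_rng(0)
params = init_siren(rng, 2 + LATENT_DIM, HIDDEN, N_SINE_LAYERS)
z_a = 0.01 * rng.standard_normal(LATENT_DIM)  # stand-ins for learned latent codes
z_b = 0.01 * rng.standard_normal(LATENT_DIM)

# Same latent code queried on the training grid and on a 4x denser grid.
v_70 = decode(params, make_grid(70), z_a).reshape(70, 70)
v_280 = decode(params, make_grid(280), z_a).reshape(280, 280)

# Linear interpolation between two latent codes, decoded on the 70 x 70 grid.
alpha = 0.5
z_mid = (1.0 - alpha) * z_a + alpha * z_b
v_mid = decode(params, make_grid(70), z_mid).reshape(70, 70)
```

Because the decoder is a function of continuous coordinates, changing the output resolution only changes the grid passed to `decode`; the latent code and the decoder parameters stay fixed, which is what makes the super-resolution query require no additional training.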