A Real Benchmark Swell Noise Dataset for Performing Seismic Data Denoising via Deep Learning
Pablo M. Barros, Roosevelt de L. Sardinha, Giovanny A. M. Arboleda, Lessandro de S. S. Valente, Isabelle R. V. de Melo, Albino Aveleda, André Bulcão, Sergio L. Netto, Alexandre G. Evsukoff
TL;DR
This work introduces a seismic denoising benchmark built by adding real swell noise to synthetic seismic data generated from elastic-wave simulations, enabling reproducible evaluation of deep learning (DL) methods. It compares two DL architectures, FCNN-5 and SRGEN-4, trained in a supervised setup on crops of the data, and proposes a new metric, SNR^2, intended to capture small performance differences better than PSNR. The results show that both models denoise effectively across varied noise levels, although preserving the underlying signal while removing noise remains challenging, especially at high noise levels. By providing a publicly available dataset and a robust evaluation framework, the study supports faster development and fair benchmarking of DL approaches for seismic data denoising, with practical industrial relevance.
Abstract
The recent development of deep learning (DL) methods for computer vision has been driven by the creation of open benchmark datasets on which new algorithms can be tested and compared with reproducible results. Although DL methods have many applications in geophysics, few real seismic datasets are available for benchmarking DL models, especially for denoising real data, which is one of the main problems in seismic data processing in the oil and gas industry. This article presents a benchmark dataset composed of synthetic seismic data corrupted with noise extracted from real data by a filtering process. Two well-known DL-based denoising models are compared on this dataset, which is proposed as a benchmark for accelerating the development of new solutions for seismic data denoising. This work also introduces a new evaluation metric that can capture small variations in model results. The results show that DL models are effective at denoising seismic data, but some issues remain to be solved.
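As an illustration of the kind of setup described above, the following minimal sketch corrupts a clean signal with a separate noise record at a chosen noise level and scores a result with the standard SNR and PSNR definitions. It is not the authors' pipeline: the function names, the sinusoidal stand-in for a synthetic trace, and the Gaussian stand-in for extracted swell noise are all assumptions made for the example, and the SNR^2 metric proposed in this work is not reproduced because its definition is not given in this section.

```python
import numpy as np


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR (dB)."""
    p_signal = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Solve p_signal / (scale**2 * p_noise) = 10**(target_snr_db / 10) for scale.
    scale = np.sqrt(p_signal / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    return clean + scale * noise


def snr_db(clean, estimate):
    """Standard signal-to-noise ratio of `estimate` against `clean`, in dB."""
    residual = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


def psnr_db(clean, estimate):
    """Peak SNR in dB, taking the peak as the largest absolute clean amplitude."""
    mse = np.mean((clean - estimate) ** 2)
    peak = np.max(np.abs(clean))
    return 10.0 * np.log10(peak ** 2 / mse)


# Hypothetical stand-ins: a sinusoid for a clean synthetic trace and
# Gaussian samples for a noise record (real swell noise is not Gaussian).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1000)
clean = np.sin(2.0 * np.pi * 25.0 * t)
noise = rng.standard_normal(t.shape)

noisy = mix_at_snr(clean, noise, target_snr_db=5.0)
print(f"SNR of noisy input:  {snr_db(clean, noisy):.2f} dB")
print(f"PSNR of noisy input: {psnr_db(clean, noisy):.2f} dB")
```

Because the clean signal is known by construction, the same two functions can score any denoiser's output against it, which is what makes a synthetic-signal, real-noise benchmark reproducible.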
