Noisereduce: Domain General Noise Reduction for Time Series Signals

Tim Sainburg, Asaf Zorea

TL;DR

Noisereduce tackles the challenge of denoising time-series signals without training data by employing spectral gating to estimate a frequency-domain mask from noise statistics. The method constructs a mask via STFT-based statistics (with a per-frequency threshold $\mathrm{thresh}_n(f)=\mu_n(f)+k\cdot\sigma_n(f)$) and applies it to the signal's spectrogram, then reconstructs the denoised signal through iSTFT; a nonstationary variant uses sliding-window statistics to adapt thresholds over time. Across speech, bioacoustics, electrophysiology, and seismology, Noisereduce outperforms conventional baselines on several metrics (e.g., STOI, PESQ, SDR, SegSNR, AUC), while deep-learning denoisers may achieve higher scores at significantly greater computational cost. The GPU-accelerated implementation enables real-time or near-real-time processing, and the approach provides a lightweight, domain-general baseline for evaluating more complex ML-based denoising methods. The authors also provide open-source code and extensive supplementary materials to support reproducibility and broader application.
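The stationary gating step summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spectral_gate`, the use of dB magnitudes for the statistics, the hard binary mask, and the defaults `k=1.5` and `nperseg=512` are all assumptions made for illustration. It only shows how a per-frequency threshold of the form $\mu_n(f)+k\cdot\sigma_n(f)$ might be turned into a mask and applied before the iSTFT.

```python
import numpy as np
from scipy.signal import stft, istft


def spectral_gate(signal, noise, fs, k=1.5, nperseg=512):
    """Hypothetical sketch of stationary spectral gating.

    signal : 1-D array, the noisy recording
    noise  : 1-D array, a clip assumed to contain noise only
    k      : number of standard deviations above the noise mean
    """
    eps = 1e-10  # avoids log(0)

    # Time-frequency representations of the noise clip and the signal.
    _, _, noise_spec = stft(noise, fs=fs, nperseg=nperseg)
    _, _, sig_spec = stft(signal, fs=fs, nperseg=nperseg)

    # Assumption: statistics are computed on magnitudes in dB.
    noise_db = 20 * np.log10(np.abs(noise_spec) + eps)
    sig_db = 20 * np.log10(np.abs(sig_spec) + eps)

    # Per-frequency threshold: mean + k * std of the noise over time.
    thresh = noise_db.mean(axis=1) + k * noise_db.std(axis=1)

    # Assumption: a hard binary mask, with no smoothing in time or frequency.
    mask = sig_db > thresh[:, np.newaxis]

    # Apply the mask to the complex spectrogram and invert.
    _, denoised = istft(sig_spec * mask, fs=fs, nperseg=nperseg)
    return denoised[: len(signal)]
```

A nonstationary variant would, under the same assumptions, replace the single `thresh` vector with one computed from sliding-window statistics of the signal itself, so that the threshold varies over time as well as frequency.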

Abstract

Extracting signals from noisy backgrounds is a fundamental problem in signal processing across a variety of domains. In this paper, we introduce Noisereduce, an algorithm for minimizing noise across a variety of domains, including speech, bioacoustics, neurophysiology, and seismology. Noisereduce uses spectral gating to estimate a frequency-domain mask that effectively separates signals from noise. It is fast, lightweight, requires no training data, and handles both stationary and non-stationary noise, making it both a versatile tool and a convenient baseline for comparison with domain-specific applications. We provide a detailed overview of Noisereduce and evaluate its performance on a variety of time-domain signals.

Paper Structure

This paper contains 36 sections, 24 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Basic outline of Noisereduce algorithm. (A) A block diagram of the steps of Noisereduce. The stationary version of the time-frequency mask is depicted. (B) An example waveform (U.S. President George W Bush stating "I know that human beings and fish can coexist peacefully") passing through the Noisereduce pipeline. The non-stationary algorithm is not shown here.
  • Figure 2: Comparison of stationary and non-stationary noise reduction. (A) Spectrogram of a clean recording of an American Robin (Macaulay Library 321642131). (B) Airplane noise imposed over the Robin recording. (C-D) Denoising of (B) with (C) stationary noisereduce and (D) nonstationary noisereduce (window size of 2 seconds). (E-F) Magnitude error vs ground truth for (E) stationary noisereduce and (F) nonstationary noisereduce. (G) Magnitude difference between stationary and nonstationary noisereduce. (H) Error (in dB) from ground truth for stationary (green) and nonstationary (purple) noisereduce.
  • Figure 3: Noise reduction samples from different algorithms applied to the 'sp04' sample from the NOIZEUS dataset (SNR: 10 dB, exhibition noise).
  • Figure 4: Noise reduction samples from different algorithms applied to the 'B335' sample from the NOIZEUS Birdsong dataset (SNR: 10 dB, waterfall noise).
  • Figure 5: Noisereduce results on a simulated extracellular recording. (A) Sample neuron waveform templates. (B) A 100 ms sample of z-scored neural data, with the original data in red and the denoised signal in black. (C-D) Spectrograms of the same data as in (B). (E) Amplitude of action potentials (blue) versus background noise (grey) in the original signal versus the denoised signal. (F) Receiver Operating Characteristic (ROC) curve of spike detection using the SpikeInterface detect_peaks algorithm.
  • ...and 3 more figures