Estimation of binary time-frequency masks from ambient noise

José Luis Romero, Michael Speckbacher

TL;DR

This work formalizes the intuition that a binary time-frequency mask can be recovered from ambient noise by analyzing the average spectrogram of filtered noise. The authors introduce a practical estimator based on the lower quantile of the averaged spectrogram and prove that, under a finite-perimeter largeness condition on $\Omega$, the recovered mask $\widehat{\Omega}$ matches $\Omega$ up to a boundary layer with high probability, independently of the noise variance. They extend the results to real white noise and provide an expectation bound showing that the reconstruction error scales with the boundary length $|\partial\Omega|$ and shrinks as $K$ grows. The analysis hinges on the spectral properties of the time-frequency localization operator $H_\Omega$, concentration of measure for quadratic forms, and reproducing-kernel techniques to control uniform deviations, yielding practical guidance for choosing $K$ and the windows in ambient-noise scenarios.

Abstract

We investigate the retrieval of a binary time-frequency mask from a few observations of filtered white ambient noise. Confirming household wisdom in acoustic modeling, we show that this is possible by inspecting the average spectrogram of ambient noise. Specifically, we show that the lower quantile of the average of $\mathcal{O}(\log(|\Omega|/\varepsilon))$ masked spectrograms is enough to identify a rather general mask $\Omega$ with confidence at least $1-\varepsilon$, up to shape details concentrated near the boundary of $\Omega$. As an application, the expected measure of the estimation error is dominated by the perimeter of the time-frequency mask. The estimator requires no knowledge of the noise variance and only a very qualitative profile of the filtering window, not exact knowledge of it.
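
The pipeline described in the abstract (filter white noise through a binary mask, average $K$ spectrograms, threshold the average) can be illustrated with a small discrete simulation. What follows is a minimal sketch under stated assumptions, not the authors' estimator: it works on a cyclic grid of length `n`, uses the same periodized Gaussian for both the model window $g$ and the reconstruction window $\varphi$, and models $H_\Omega$ as "STFT, multiply by the mask, synthesize". The threshold rule (a quarter of the 0.9-quantile of the averaged spectrogram) is a placeholder chosen for this sketch and presumes the mask covers more than 10% of the grid; the paper's actual quantile rule is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def gaussian_window(n):
    # Periodized Gaussian on the cyclic group Z_n, normalized to unit l2 norm.
    t = np.arange(n)
    d = np.minimum(t, n - t)              # cyclic distance to 0
    g = np.exp(-np.pi * d**2 / n)
    return g / np.linalg.norm(g)

def shift_matrix(g):
    # G[s, t] = g[(t - s) mod n]: every cyclic time shift of the window.
    n = len(g)
    idx = (np.arange(n)[None, :] - np.arange(n)[:, None]) % n
    return g[idx]

def stft(f, G):
    # V[s, m] = sum_t f[t] * conj(G[s, t]) * exp(-2j*pi*m*t/n)
    return np.fft.fft(f[None, :] * np.conj(G), axis=1)

def istft(V, G):
    # Scaled adjoint of stft; it inverts stft because the window has unit norm.
    return np.sum(np.fft.ifft(V, axis=1) * G, axis=0)

def estimate_mask(mask, K=20, sigma=1.0, seed=0):
    # Average the spectrograms of K masked complex white-noise realizations,
    # then keep the grid points above a variance-free, quantile-based threshold.
    n = mask.shape[0]
    rng = np.random.default_rng(seed)
    G = shift_matrix(gaussian_window(n))
    rho = np.zeros((n, n))
    for _ in range(K):
        noise = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        filtered = istft(mask * stft(noise, G), G)   # discrete stand-in for H_Omega
        rho += np.abs(stft(filtered, G)) ** 2 / K
    # ASSUMPTION: placeholder threshold, not the rule from the paper.
    threshold = 0.25 * np.quantile(rho, 0.9)
    return rho > threshold

n = 256
mask = np.zeros((n, n), dtype=bool)
mask[64:192, 64:192] = True               # square mask covering 25% of the grid
est = estimate_mask(mask)
rel_err = np.logical_xor(mask, est).sum() / mask.sum()
print(f"relative symmetric difference: {rel_err:.3f}")
```

Because the threshold is defined relative to a quantile of the averaged spectrogram, rescaling the noise (the `sigma` argument) rescales both sides of the comparison, which mirrors the claim that no knowledge of the noise variance is needed. The mismatch between `mask` and `est` is expected to concentrate in a band around the edges of the square, in line with the "up to boundary details" statement.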

Paper Structure

This paper contains 23 sections, 16 theorems, 141 equations, and 2 figures.

Key Result

Theorem 2.1

Let $g,\varphi\in\mathcal{S}(\mathbb{R}^d)$ with $\left\| g \right\|_{2}=\left\| \varphi \right\|_{2}=1$ be model and reconstruction windows. Let $\Omega\subset \mathbb{R}^{2d}$ be compact and satisfying the largeness condition eq_as2. Let $K \geq 4$, assume that $K$ independent realizations of $H_\

Figures (2)

  • Figure 1: Time-frequency filters can suppress details along the mask boundary. Plot of $|V_\varphi f|\cdot \chi_\Omega$ (left) and $|V_\varphi H_\Omega f|$ (right), with $\varphi(t)=2^{1/4}e^{-\pi t^2}$, $f$ a linear combination of time-frequency shifts of $\varphi$, and $\Omega$ a disk (top) compared to a slightly perturbed disk (bottom).
  • Figure 2: The symmetric difference $\Omega\Delta\widehat{\Omega}$ is depicted in grey and $\partial \Omega$ in black. Parameters: $|\Omega|=100$; $K=20$; $\varphi(t)=2^{1/4}e^{-\pi t^2}$; left column: $g=\varphi$; right column: $g(t)= \varphi(t)t^2$.

Theorems & Definitions (21)

  • Theorem 2.1: Recovery up to boundary details
  • Theorem 2.2
  • Theorem 2.3
  • Theorem 3.1
  • proof
  • Lemma 4.1
  • Lemma 4.2
  • proof
  • Lemma 5.1
  • Theorem 5.2
  • ...and 11 more