Why some audio signal short-time Fourier transform coefficients have nonuniform phase distributions

Stephen D. Voran

TL;DR

The paper shows that although STFT phases $\phi_k$ are commonly assumed to be uniformly distributed overall, per-bin phase distributions can be far from uniform because of tonal content and window sidelobes. It derives the tone-to-bin phase mapping, revealing nonlinear relationships that create intrinsic phase peaks in each coefficient, and demonstrates how window shape modulates these effects. The key contributions are closed-form relationships for the tonal impact on $\phi_k$, identification of four intrinsic phase peaks per coefficient, and empirical evidence across diverse audio that windows with stronger sidelobe suppression yield more uniform phase distributions. The findings imply that per-bin phase priors, rather than a global uniform prior, can improve phase-aware tasks such as reconstruction and source separation.

Abstract

The short-time Fourier transform (STFT) represents a window of audio samples as a set of complex coefficients. These are advantageously viewed as magnitudes and phases, and the overall distribution of phases is very often assumed to be uniform. We show that when audio signal STFT phase distributions are analyzed per-frequency or per-magnitude range, they can be far from uniform. That is, the uniform phase distribution assumption obscures significant details. We explain the significance of the nonuniform phase distributions and how they might be exploited, derive their source, and explain why the choice of the STFT window shape influences the nonuniformity of the resulting phase distributions.
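
One way to see the effect the abstract describes is a small numerical check. The sketch below is not the paper's derivation: it assumes a rectangular window of length $N=16$, a single tone placed halfway between bins 1 and 2, and tone phases $\theta$ swept uniformly over $[0, 2\pi)$. The helper names `bin_phases` and `nonuniformity`, and the total-variation score itself, are hypothetical choices made for illustration and are not the paper's $\overline{u}$.

```python
import numpy as np

def bin_phases(omega, thetas, k, N=16, real_tone=True):
    """Phase of DFT bin k for one rectangular-windowed tone frame per theta."""
    n = np.arange(N)
    arg = omega * n[None, :] + thetas[:, None]          # shape (len(thetas), N)
    frames = np.cos(arg) if real_tone else np.exp(1j * arg)
    coeffs = frames @ np.exp(-2j * np.pi * k * n / N)   # DFT coefficient k
    return np.angle(coeffs)

def nonuniformity(phases, n_bins=36):
    """Total-variation distance between the phase histogram and a flat one.

    Illustrative score only (an assumption), not the paper's measure.
    """
    counts, _ = np.histogram(phases, bins=n_bins, range=(-np.pi, np.pi))
    p = counts / counts.sum()
    return 0.5 * np.abs(p - 1.0 / n_bins).sum()

N, k = 16, 1
omega = 2 * np.pi * 1.5 / N                    # tone halfway between bins 1 and 2
thetas = np.linspace(0.0, 2 * np.pi, 36000, endpoint=False)

u_real = nonuniformity(bin_phases(omega, thetas, k, N, real_tone=True))
u_complex = nonuniformity(bin_phases(omega, thetas, k, N, real_tone=False))
print(f"real tone:    {u_real:.4f}")
print(f"complex tone: {u_complex:.4f}")
```

Under these assumptions the complex tone gives an essentially flat histogram, because its bin phase is $\theta$ plus a constant. The real tone does not, because its negative-frequency image leaks into the bin through the window sidelobes and makes $\phi_k$ a nonlinear function of $\theta$.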

Paper Structure

This paper contains 11 sections, 16 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Audio STFT phase histogram images. Left: per-frequency histograms where each vertical stripe shows a phase histogram for a given frequency. Right: per-magnitude histograms where each annular region shows a phase histogram for a magnitude range (smallest magnitudes at origin). White indicates highest probability.
  • Figure 2: Example shows how tone frequency $\omega_t$ drives $c_\Re/c_\Im$ and $P_{2\pi}(\zeta_\Re\!-\!\zeta_\Im)$ per (\ref{eqn:atanRect}) for each of the seven STFT coefficients of interest when $N=16$. STFT coefficient frequencies shown by dotted vertical lines. Right panel uses same color coding as left panel.
  • Figure 3: Five example relationships between tone phase $\theta$ and STFT coefficient phase $\phi_k$. Conditions for each example given in Table \ref{table:examples}.
  • Figure 4: STFT phase histogram image shows that tones with uniformly distributed frequencies and phases produce nonuniformly distributed STFT phases. White is highest probability.
  • Figure 5: Increasing window shape parameter (see (\ref{eqn:window}) and Section \ref{ssec:tonePhase}) increases sidelobe suppression and thus decreases nonuniformity $\overline{u}$ of phase distributions.
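
The sidelobe trend in the Figure 5 caption can be illustrated with a stand-in window family. The window defined in the paper is not reproduced in this summary, so the sketch below substitutes NumPy's Kaiser window, whose `beta` parameter likewise trades main-lobe width for sidelobe suppression. The function name `peak_sidelobe_db`, the window length of 16, and the `beta` values are assumptions chosen for illustration.

```python
import numpy as np

def peak_sidelobe_db(window, n_fft=4096):
    """Peak sidelobe level in dB relative to the main-lobe peak."""
    spectrum = np.abs(np.fft.rfft(window, n_fft))   # zero-padded magnitude response
    spectrum /= spectrum[0]                          # normalize to the DC (main-lobe) peak
    # Walk down the main lobe until the magnitude stops decreasing (first null).
    i = 1
    while i < len(spectrum) - 1 and spectrum[i + 1] < spectrum[i]:
        i += 1
    # Everything from the first null onward is sidelobe territory.
    return 20 * np.log10(spectrum[i:].max())

N = 16
betas = [0.0, 4.0, 8.0]          # beta = 0 is the rectangular window
levels = [peak_sidelobe_db(np.kaiser(N, b)) for b in betas]
for b, level in zip(betas, levels):
    print(f"beta = {b:3.1f}: peak sidelobe {level:6.1f} dB")
```

A larger `beta` lowers the peak sidelobe, which reduces how much energy from distant frequencies, including a tone's negative-frequency image, leaks into a given bin. This only demonstrates the sidelobe side of the caption; it does not reproduce the paper's nonuniformity measurements.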