Some clues to build a sound analysis relevant to hearing

Laurent Millot

TL;DR

This work addresses the mismatch between conventional sound analysis tools and human listening. It advocates a perceptually grounded framework built on the à trous multi-band decomposition, offering two frequency mappings (audio/octave and Leipp sensible bands) and ISD/TISD visualizations to enable listening to each component with minimal distortion. The paper details the algorithmic core, frequency-mapping choices, and display concepts, and outlines collaborative experiments with musicians and engineers to refine mappings and real-time viability. By prioritizing perceptual relevance and modularity, the approach aims to unify analysis for acoustics, perception research, and sound production, with potential applications in ambient sound extraction and instrument modeling.

Abstract

Analysis tools used in research laboratories, for sound synthesis, and by musicians or sound engineers can differ considerably. Discussing the assumptions and limitations of these tools makes it possible to propose a first tool that is as relevant and versatile as possible for everyone working with sound, with one major aim: one must be able to listen to each element of the analysis, because hearing is the final reference tool. In the future, this tool should also be used to reinvestigate the definition of sound (or Acoustics) on the basis of some recent work on musical instrument modeling, speech production and loudspeaker design. Audio illustrations will be given.

Paper 6041 presented at the 116th Convention of the Audio Engineering Society, Berlin, 2004

Paper Structure (11 sections, 10 figures)

Figures (10)

  • Figure 1: Examples of waveforms: (left) spoken message in French, (right) over-pressure measurement for a normal G blow (fourth channel) on a G diatonic harmonica.
  • Figure 2: Zoom on a few "periods" of the over-pressure signal for the same blown note on the diatonic harmonica.
  • Figure 3: Spectra of both test signals: (left) speech, (right) pressure measurement.
  • Figure 4: Spectrograms of both test signals using a 2048-point FFT: (left) speech, (right) pressure measurement.
  • Figure 5: Mallat algorithm for the fast dyadic wavelet transform: (left) analysis stage, (right) synthesis stage.
  • ...and 5 more figures