Spatial Analysis and Synthesis Methods: Subjective and Objective Evaluations Using Various Microphone Arrays in the Auralization of a Critical Listening Room

Alan Pawlak, Hyunkook Lee, Aki Mäkivirta, Thomas Lund

TL;DR

This paper studies the performance of various sound field reproduction methods in the context of the auralization of a critical listening room, focusing on fixed head orientations, and finds that SDM and HO-SIRR show similarities in spatial fidelity.

Abstract

Parametric sound field synthesis methods, such as the Spatial Decomposition Method (SDM) and Higher-Order Spatial Impulse Response Rendering (HO-SIRR), are widely used for the analysis and auralization of sound fields. This paper studies the performance of various sound field synthesis methods in the context of the auralization of a critical listening room. The influence of the following factors on perceived spatial and timbral fidelity is considered: the rendering framework, the direction of arrival (DOA) estimation method, the microphone array structure, and the use of a dedicated center reference microphone with SDM. Listening tests compare the synthesized sound fields to a reference binaural rendering condition. Several acoustic parameters are measured to gain insight into objective differences between the methods. A high-quality pressure microphone improves the SDM framework's timbral fidelity. Additionally, SDM and HO-SIRR show similarities in spatial fidelity. Performance variation between SDM configurations is influenced by the DOA estimation method and microphone array construction. The binaural SDM (BSDM) presentations display temporal artifacts that impact sound quality.

Paper Structure (33 sections, 3 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Subjective evaluation results showing median spatial and timbral fidelity scores on a similarity scale (1-5) for the evaluated systems (SYSTEM). Scores are compiled from multiple source positions (POSITION) and program materials (ITEM), with 95% non-parametric confidence intervals.
  • Figure 2: Subjective evaluation results showing median spatial and timbral fidelity scores on a similarity scale (1-5) across different systems (SYSTEM) and source positions (POSITION), aggregated over all program materials (ITEM). The graph includes 95% non-parametric confidence intervals.
  • Figure 3: Subjective evaluation results showing median spatial and timbral fidelity scores on a similarity scale (1-5) across different systems (SYSTEM) and program materials (ITEM), aggregated over all source positions (POSITION). The graph includes 95% non-parametric confidence intervals.
  • Figure 4: Principal Component Analysis (PCA) of spatial and timbral fidelity scores for the evaluated systems (SYSTEM), considering the median spatial and timbral fidelity scores across different program materials (ITEM) and source positions (POSITION). Systems are clustered in a two-dimensional space by the first two principal components, highlighting the similarities in their fidelity scores.
  • Figure 5: Correlation between median scores of spatial and timbral fidelity across all source positions and stimuli.
  • ...and 1 more figures
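
The figure captions above report median scores with 95% non-parametric confidence intervals. The page does not say which interval procedure the authors used, so the following is only a minimal sketch of one common choice, a percentile bootstrap of the median. The function name, the resample count, and the example ratings are hypothetical and are not taken from the paper.

```python
import random
import statistics


def median_with_bootstrap_ci(scores, n_resamples=2000, level=0.95, seed=0):
    """Return (median, lower, upper) using a percentile bootstrap.

    Assumption: a percentile bootstrap stands in for the paper's
    unspecified "non-parametric" interval.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(scores)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * n_resamples)
    hi_idx = min(n_resamples - 1, int((1.0 - tail) * n_resamples))
    return statistics.median(scores), medians[lo_idx], medians[hi_idx]


# Hypothetical ratings on the 1-5 similarity scale for one system.
ratings = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4, 5, 4]
median, lower, upper = median_with_bootstrap_ci(ratings)
print(f"median={median}, 95% CI=[{lower}, {upper}]")
```

Because the ratings are ordinal and restricted to a five-point scale, the interval endpoints can only fall on observed values or midpoints between them, which is one reason a median with a resampling-based interval is often preferred over a mean with a normal-theory interval for this kind of data.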