The role of direct sound spherical harmonics representation in externalization using binaural reproduction

Eran Miller, Boaz Rafaely

TL;DR

This study investigates how the direct sound component influences externalization in binaural playback using Ambisonics-based representations. By employing a mixed SH-order framework, where the direct path is encoded at a high order $N_d$ and the reverberant path at a lower order $N_r$, the authors quantify externalization through a formal MUSHRA listening test in two reverberant environments. The key finding is that enhancing the direct component substantially improves externalization, with the mixed SH-order signal often matching the externalization level of a full $N=3$ Ambisonics signal and outperforming first-order Ambisonics. These results have practical implications for spatial-audio workflows, suggesting that improving direct-sound representation can yield meaningful perceptual benefits without fully upgrading the entire SH order.
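As a rough illustration of the mixed SH-order idea described above, the sketch below sums a direct-sound SH signal of order $N_d$ with a reverberant SH signal of a lower order $N_r$ by zero-padding the lower-order part. This is an assumption about how such a mix could be formed, not the authors' implementation. The function names (`num_sh_channels`, `mix_sh_orders`), the example orders, and the random placeholder signals are all hypothetical. The sketch also assumes a channel ordering in which lower-order coefficients come first (as in ACN ordering).

```python
import numpy as np


def num_sh_channels(order: int) -> int:
    """Number of SH coefficients up to and including the given order: (N + 1)^2."""
    return (order + 1) ** 2


def mix_sh_orders(direct: np.ndarray, reverb: np.ndarray) -> np.ndarray:
    """Sum two SH-domain signals of possibly different orders.

    Both inputs have shape (channels, samples). Channels are assumed to be
    ordered so that lower-order coefficients come first, so the lower-order
    signal is implicitly zero-padded in the higher-order channels.
    """
    if direct.shape[1] != reverb.shape[1]:
        raise ValueError("direct and reverb must have the same number of samples")
    n_channels = max(direct.shape[0], reverb.shape[0])
    mixed = np.zeros((n_channels, direct.shape[1]))
    mixed[: direct.shape[0]] += direct
    mixed[: reverb.shape[0]] += reverb
    return mixed


# Illustrative orders only; these are not the settings used in the paper.
N_d, N_r = 3, 1
rng = np.random.default_rng(0)
direct = rng.standard_normal((num_sh_channels(N_d), 8))  # placeholder direct part
reverb = rng.standard_normal((num_sh_channels(N_r), 8))  # placeholder reverberant part

mixed = mix_sh_orders(direct, reverb)
print(mixed.shape)  # (16, 8)
```

In this sketch the channels above order $N_r$ carry only direct-sound information, which is the sense in which the direct path is represented at a higher order than the reverberant path.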

Abstract

The importance of the information in the direct sound to human perception of spatial sound sources is an ongoing research topic. The distinction between direct sound and diffuse or reverberant sound forms the basis of numerous studies in the field of spatial audio. In particular, parametric spatial audio representation methods use this distinction and employ signal processing to enhance the audio quality at reproduction. However, the current literature does not report on the impact of an ideal direct sound representation on externalization in the context of Ambisonics. This paper aims to assess the importance of the spatial information in the direct sound to the externalization of a sound field when using binaural reproduction. This is done in the spherical harmonics (SH) domain, where an ideal direct sound representation within an otherwise low-order Ambisonics signal is simulated, and its perceived externalization is evaluated in a formal listening test. The investigation leads to the conclusion that the externalization of a first-order Ambisonics signal may be significantly improved by enhancing the direct sound component, up to a level similar to that of a third-order Ambisonics signal.

Paper Structure (9 sections, 7 equations, 1 figure, 2 tables)

Figures (1)

  • Figure 1: Externalization ratings in environment 1 and environment 2, shown as box plots. The red line marks the median, and the bottom and top edges of each box mark the 25th and 75th percentiles, respectively. The whiskers show the variability of the ratings beyond the lower and upper quartiles, and outliers are marked with a red '+'. The notch widths are calculated so that boxes with non-overlapping notches have medians that differ at the 5% significance level.