The role of direct sound spherical harmonics representation in externalization using binaural reproduction
Eran Miller, Boaz Rafaely
TL;DR
This study investigates how the direct sound component influences externalization in binaural playback using Ambisonics-based representations. By employing a mixed SH-order framework, where the direct path is encoded at a high order $N_d$ and the reverberant path at a lower order $N_r$, the authors quantify externalization through a formal MUSHRA listening test in two reverberant environments. The key finding is that enhancing the direct component substantially improves externalization, with the mixed SH-order signal often matching the externalization level of a full $N=3$ Ambisonics signal and outperforming first-order Ambisonics. These results have practical implications for spatial-audio workflows, suggesting that improving direct-sound representation can yield meaningful perceptual benefits without fully upgrading the entire SH order.
Abstract
The importance of the information carried by the direct sound to human perception of spatial sound sources is an ongoing research topic. The distinction between direct sound and diffuse or reverberant sound forms the basis of numerous studies in the field of spatial audio. In particular, parametric spatial audio representation methods rely on this distinction and apply signal processing to enhance audio quality at reproduction. However, the current literature does not address the impact of an ideal direct sound representation on externalization in the context of Ambisonics. This paper aims to assess the importance of the spatial information in the direct sound to the externalization of a sound field under binaural reproduction. The assessment is carried out in the spherical harmonics (SH) domain: an ideal direct sound representation is simulated within an otherwise unmodified Ambisonics signal, and its perceived externalization is evaluated in a formal listening test. The investigation leads to the conclusion that the externalization of a first-order Ambisonics signal may be significantly improved by enhancing the direct sound component, up to a level similar to that of a third-order Ambisonics signal.
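To make the mixed SH-order idea concrete, the following is a minimal sketch, not the authors' implementation, of how a direct-sound signal of order $N_d$ and a reverberant signal of lower order $N_r$ could be combined in the SH domain. It assumes that both signals are stored as arrays of shape (channels, samples), that channels are ordered so that lower SH orders come first (as in ACN ordering), and that both use the same normalization. The function names `num_sh_channels` and `mix_sh_orders` are hypothetical and chosen for illustration only.

```python
import numpy as np


def num_sh_channels(order: int) -> int:
    """Number of SH channels of an order-N signal: (N + 1)^2."""
    return (order + 1) ** 2


def mix_sh_orders(direct: np.ndarray, reverb: np.ndarray) -> np.ndarray:
    """Sum a high-order direct part and a lower-order reverberant part.

    Assumption: both arrays have shape (channels, samples) and share a
    channel ordering in which lower SH orders occupy the first rows.
    """
    if reverb.shape[0] > direct.shape[0]:
        raise ValueError("reverberant part must not exceed the direct part's order")
    if direct.shape[1] != reverb.shape[1]:
        raise ValueError("direct and reverberant parts must have equal length")
    # Copy the direct part into a common dtype so the inputs stay untouched.
    mixed = np.array(direct, dtype=np.result_type(direct, reverb))
    # The reverberant part only contributes to the low-order channels.
    mixed[: reverb.shape[0]] += reverb
    return mixed


# Example: direct sound at N_d = 3 (16 channels), reverberation at N_r = 1 (4 channels).
n_d, n_r, n_samples = 3, 1, 8
direct = np.ones((num_sh_channels(n_d), n_samples))
reverb = 2.0 * np.ones((num_sh_channels(n_r), n_samples))
mixed = mix_sh_orders(direct, reverb)
```

In this sketch the mixed signal keeps the $(N_d+1)^2$ channels of the direct part, while the reverberant energy is confined to the first $(N_r+1)^2$ channels. The result can then be passed to any binaural renderer that accepts an order-$N_d$ SH signal.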
