Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition
Vladimir Tourbabin, Boaz Rafaely
TL;DR
This work addresses the lack of a theoretical framework for designing microphone arrays on humanoid robots by introducing a wide-band GHRTF-based measurement model and a principled array-quality measure, the effective rank $\mathcal{R}(\mathbf{H})$ of the GHRTF matrix $\mathbf{H}$. It shows that a higher $\mathcal{R}(\mathbf{H})$ improves beamformer robustness and MUSIC DOA estimation accuracy, and it proposes a genetic-algorithm (GA) optimization that places microphones on a head surface to maximize $\mathcal{R}(\mathbf{H})$. The authors validate the framework with a GHRTF database generated on a KU-100 head, demonstrating that high-rank arrays yield significantly better robustness (up to tens of dB) and lower DOA variance, especially under low SNR and at low frequencies. The result is a practical, information-theoretic criterion for microphone placement in humanoid-robot audition, enabling improved sound localization and separation in realistic multi-source environments.
Abstract
An important aspect of a humanoid robot is audition. Previous work has presented robot systems capable of sound localization and source segregation based on microphone arrays with various configurations. However, no theoretical framework for the design of these arrays has been presented. In the current paper, a design framework is proposed based on a novel array quality measure. The measure is based on the effective rank of a matrix composed of the generalized head-related transfer functions (GHRTFs) that account for microphone positions other than the ears. The measure is shown to be theoretically related to standard array performance measures such as beamforming robustness and direction-of-arrival (DOA) estimation accuracy. Then, the measure is applied to produce sample designs of microphone arrays. Their performance is investigated numerically, verifying the advantages of array design based on the proposed theoretical framework.
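
To illustrate the array quality measure, the following minimal sketch computes an effective rank from a set of singular values. It assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values); the text above does not state the exact formula used in the paper, so this definition and the function name `effective_rank` are assumptions made here for illustration only. In practice the singular values would come from a GHRTF matrix $\mathbf{H}$, for example via `numpy.linalg.svd(H, compute_uv=False)`.

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank (assumed definition, not taken from the paper).

    Normalizes the singular values to a probability distribution and returns
    exp(Shannon entropy), which ranges from 1 to the number of nonzero values.
    """
    # Keep only strictly positive values; zeros contribute nothing to the entropy.
    s = [float(v) for v in singular_values if v > 0.0]
    total = sum(s)
    if total == 0.0:
        raise ValueError("at least one singular value must be positive")
    p = [v / total for v in s]
    entropy = -sum(q * math.log(q) for q in p)
    return math.exp(entropy)


if __name__ == "__main__":
    # Hypothetical singular-value profiles for a 4-microphone array.
    balanced = [1.0, 1.0, 1.0, 1.0]       # energy spread evenly across modes
    dominated = [1.0, 0.01, 0.01, 0.01]   # one mode dominates
    print(effective_rank(balanced))
    print(effective_rank(dominated))
```

Under this assumed definition, equal singular values give an effective rank approximately equal to the number of microphones, while a matrix dominated by a single singular value gives a value close to 1. This matches the intuition that a high-rank array captures more independent spatial information.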
