Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition

Vladimir Tourbabin; Boaz Rafaely

TL;DR

This work addresses the lack of a theoretical framework for designing microphone arrays on humanoid robots by introducing a wide-band GHRTF-based measurement model and a principled array-quality measure, the effective rank $\mathcal{R}(\mathbf{H})$. It shows how a higher $\mathcal{R}(\mathbf{H})$ improves beamformer robustness and MUSIC DOA accuracy, and it proposes a genetic-algorithm (GA) based optimization that places microphones on a head surface to maximize $\mathcal{R}(\mathbf{H})$. The authors validate the framework with a GHRTF database generated on a KU-100 head, demonstrating that high-rank arrays yield significantly better robustness (up to tens of dB) and lower DOA variance, especially under low SNR and at low frequencies. The result is a practical, information-theoretic criterion for principled microphone placement in humanoid-robot audition, enabling improved sound localization and separation in realistic multi-source environments.
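To make the array-quality measure concrete, the following is a minimal sketch of an effective-rank computation. It assumes the entropy-based definition (the exponential of the Shannon entropy of the singular values normalized to sum to one); the paper's exact normalization may differ. The function name `effective_rank` and the layout of `H` (rows for microphones, columns for source directions) are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def effective_rank(H):
    """Entropy-based effective rank of a matrix H (assumed definition).

    H is a hypothetical GHRTF matrix, e.g. shape (n_mics, n_directions).
    """
    # Singular values only; the singular vectors are not needed.
    s = np.linalg.svd(np.asarray(H), compute_uv=False)
    total = s.sum()
    if total == 0:
        return 0.0
    # Normalize the singular values into a probability-like distribution.
    p = s / total
    p = p[p > 0]  # drop exact zeros so that log() is defined
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))
```

Under this definition the value lies between 1 (all energy in one singular value) and the number of nonzero singular values (all of them equal), so it behaves like a continuous count of usable dimensions.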

Abstract

An important aspect of a humanoid robot is audition. Previous work has presented robot systems capable of sound localization and source segregation based on microphone arrays with various configurations. However, no theoretical framework for the design of these arrays has been presented. In the current paper, a design framework is proposed based on a novel array quality measure. The measure is based on the effective rank of a matrix composed of the generalized head related transfer functions (GHRTFs) that account for microphone positions other than the ears. The measure is shown to be theoretically related to standard array performance measures such as beamforming robustness and DOA estimation accuracy. Then, the measure is applied to produce sample designs of microphone arrays. Their performance is investigated numerically, verifying the advantages of array design based on the proposed theoretical framework.
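As an illustration of how such a measure could drive array design, here is a toy genetic-algorithm search that selects microphone positions from a set of candidate surface points so as to maximize the effective rank. This is only a sketch: the selection, crossover, and mutation operators, the population settings, and the random stand-in for a GHRTF matrix are all assumptions made for the example, and the entropy-based effective rank is assumed as well. None of these details are taken from the paper.

```python
import numpy as np


def effective_rank(H):
    """Entropy-based effective rank (assumed definition)."""
    s = np.linalg.svd(np.asarray(H), compute_uv=False)
    total = s.sum()
    if total == 0:
        return 0.0
    p = s / total
    p = p[p > 0]
    return float(np.exp(-np.sum(p * np.log(p))))


def ga_place(G, n_mics, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Pick n_mics rows of G (candidates x directions) with high effective rank.

    All operators and defaults here are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    n_cand = G.shape[0]

    def fitness(idx):
        return effective_rank(G[idx, :])

    # Each individual is a set of distinct candidate indices.
    pop = [rng.choice(n_cand, size=n_mics, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        elite = [pop[i] for i in order[: pop_size // 2]]  # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(len(elite), size=2, replace=False)
            # Crossover: sample the child from the union of both parents.
            pool = np.union1d(elite[a], elite[b])
            child = rng.choice(pool, size=n_mics, replace=False)
            # Mutation: swap one position for an unused candidate.
            if rng.random() < mutation_rate:
                unused = np.setdiff1d(np.arange(n_cand), child)
                if unused.size:
                    child[rng.integers(n_mics)] = rng.choice(unused)
            children.append(child)
        pop = elite + children
    scores = np.array([fitness(ind) for ind in pop])
    best = int(np.argmax(scores))
    return np.sort(pop[best]), float(scores[best])


# Random complex matrix standing in for a GHRTF database (not real data).
demo_rng = np.random.default_rng(1)
G_demo = demo_rng.standard_normal((40, 60)) + 1j * demo_rng.standard_normal((40, 60))
best_idx, best_score = ga_place(G_demo, n_mics=5)
```

Because the better half of each generation is carried over unchanged, the best score never decreases from one generation to the next. With a real GHRTF database, `G_demo` would be replaced by the transfer functions sampled at the candidate head-surface points.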

Paper Structure (19 sections, 35 equations, 9 figures)

Introduction
Background
Measurement model
Beamforming
Direction of arrival estimation
Generalized measurement model
Effective Rank - A Measure of Array Quality
Optimal Microphone Positioning and Array Performance
Optimal microphone positioning
Relation to Beamformer Robustness
Relation to DOA estimation accuracy
Simulation Study
Computation of GHRTFs from head geometry
Relation between the geometry and the effective rank
Sensor positioning example
...and 4 more sections

Figures (9)

Figure 1: The dummy head geometry model that was used for the construction of the GHRTF database.
Figure 2: Effective rank calculated for three different source arrangements: (a) sources on the horizontal plane, (b) sources on the median plane, (c) nearly uniformly distributed sources. The left column shows the effective rank; the right column illustrates the source distribution.
Figure 3: Plot of the GHRTFs at a frequency of $3$ kHz for the points $1,2$ and $3$ (see Fig. 2b). The points, along with the associated effective ranks (measured in dimensions, [dim]), are indicated in the legend. For convenience, the phase component is plotted as the propagation delay relative to the propagation from zero elevation.
Figure 4: Representative examples of optimal positioning for $2,3,5$ and $10$ microphone arrays, obtained using the GA approach. Microphone positions are indicated by the black circles.
Figure 5: The ratio between the sensitivity of the MER array and the sensitivity of an array designed to have an effective rank that is smaller than the maximum effective rank by: (a) 1 dimension, (b) 5 dimensions.
...and 4 more figures
