Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition

Vladimir Tourbabin, Boaz Rafaely

TL;DR

This work addresses the lack of a theoretical framework for designing microphone arrays on humanoid robots. It introduces a wide-band GHRTF-based measurement model and a principled array-quality measure, the effective rank $\mathcal{R}(\mathbf{H})$, shows how a higher $\mathcal{R}(\mathbf{H})$ improves beamformer robustness and MUSIC DOA accuracy, and proposes a GA-based optimization that places microphones on a head surface to maximize $\mathcal{R}(\mathbf{H})$. The authors validate the framework with a GHRTF database generated on a KU-100 head, demonstrating that high-rank arrays yield significantly better robustness (up to tens of dB) and lower DOA variance, especially under low SNR and at low frequencies. The result is a practical, information-theoretic criterion for principled microphone placement in humanoid-robot audition, enabling improved sound localization and separation in realistic multi-source environments.

Abstract

An important aspect of a humanoid robot is audition. Previous work has presented robot systems capable of sound localization and source segregation based on microphone arrays with various configurations. However, no theoretical framework for the design of these arrays has been presented. In the current paper, a design framework is proposed based on a novel array quality measure. The measure is based on the effective rank of a matrix composed of the generalized head related transfer functions (GHRTFs) that account for microphone positions other than the ears. The measure is shown to be theoretically related to standard array performance measures such as beamforming robustness and DOA estimation accuracy. Then, the measure is applied to produce sample designs of microphone arrays. Their performance is investigated numerically, verifying the advantages of array design based on the proposed theoretical framework.
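
To make the quality measure concrete, the sketch below computes an effective rank for a small matrix. It assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values); the text above does not spell out the definition, so treat this as an illustration, not the authors' exact formulation. The function name `effective_rank` and the toy matrices are hypothetical stand-ins and are not GHRTF data.

```python
import numpy as np


def effective_rank(H: np.ndarray) -> float:
    """Entropy-based effective rank of H (assumed definition, see lead-in)."""
    # Singular values only; the singular vectors are not needed here.
    s = np.linalg.svd(H, compute_uv=False)
    total = s.sum()
    if total == 0.0:
        return 0.0  # all-zero matrix: no usable dimensions
    # Normalize the singular values into a probability-like distribution.
    p = s / total
    p = p[p > 0]  # drop exact zeros so log() is defined
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


# Toy stand-ins for a transfer-function matrix (hypothetical, not measured data).
H_diverse = np.eye(4, dtype=complex)          # four fully independent channels
H_redundant = np.ones((4, 4), dtype=complex)  # four identical channels

rank_diverse = effective_rank(H_diverse)
rank_redundant = effective_rank(H_redundant)
print(rank_diverse, rank_redundant)
```

Under this assumed definition, a matrix whose singular values are all equal reaches the maximum value (its number of non-zero singular values), while a matrix dominated by a single singular value scores close to one. That matches the intuition in the abstract: microphone positions whose GHRTFs carry largely redundant information yield a lower effective rank.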

Paper Structure

This paper contains 19 sections, 35 equations, and 9 figures.

Figures (9)

  • Figure 1: The dummy head geometry model that was used for the construction of the GHRTF database.
  • Figure 2: Effective rank calculated for three different source arrangements: (a) sources on the horizontal plane, (b) sources on the median plane, (c) nearly uniformly distributed sources. The left column shows the effective rank; the right column illustrates the source distribution.
  • Figure 3: Plot of the GHRTFs at the frequency of $3$ kHz for the points $1,2$ and $3$ (see Fig. 2b). The points, along with the associated effective ranks (measured in dimensions, [dim]), are indicated in the legend. For convenience, the phase component is plotted as the propagation delay relative to the propagation from zero elevation.
  • Figure 4: Representative examples of optimal positioning for $2,3,5$ and $10$ microphone arrays, obtained using the GA approach. Microphone positions are indicated by the black circles.
  • Figure 5: The ratio between the sensitivity of the MER array and the sensitivity of an array designed to have an effective rank that is smaller than the maximum effective rank by: (a) 1 dimension, (b) 5 dimensions.
  • ...and 4 more figures