Diminishing Domain Mismatch for DNN-Based Acoustic Distance Estimation via Stochastic Room Reverberation Models

Tobias Gburrek; Adrian Meise; Joerg Schmalenstroeer; Reinhold Haeb-Umbach

TL;DR

The paper addresses the problem of training DNN-based acoustic distance estimators when little annotated real-room data is available. It reduces the domain mismatch between simulated and real RIRs with a hybrid RIR model that couples geometric early reflections (image-source method with cardioid directivity) with a stochastic model for late reflections. The late part is scaled to match an energy ratio $\eta$ derived from the room geometry, which accounts for source directivity via $D(\varphi,\rho)$, and a smooth transition function blends the early and diffuse components. The distance estimator operates on magnitude and phase-derived features and is trained as a classification task with 0.1 m resolution, with checkpoint averaging to improve cross-domain generalization. Experiments on MIRaGe and MIRD show that the proposed simulator generalizes better to real data than pure image-source simulations, which underscores the value of realistic late-reverberation modeling and source directivity in synthetic training data for distance estimation.
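The blending idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hybrid_rir`, the logistic transition shape, the parameters `t_mix` and `t_width`, and the definition of `eta` as late energy divided by early energy are all assumptions made for illustration. The early part is passed in as a precomputed array (for example, from an image-source simulator), and the late part is decaying Gaussian noise.

```python
import numpy as np

def hybrid_rir(early, fs, t60, eta, t_mix, t_width=0.005, seed=0):
    """Blend a geometric early part with a stochastic late tail (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = len(early)
    t = np.arange(n) / fs
    # Stochastic tail: white Gaussian noise with an exponential envelope
    # whose amplitude has dropped by 60 dB after t60 seconds.
    decay = 10.0 ** (-3.0 * t / t60)
    late = rng.standard_normal(n) * decay
    # Smooth transition weight: close to 0 before t_mix, close to 1 after it.
    # The logistic shape is an assumption, not taken from the paper.
    w = 1.0 / (1.0 + np.exp(-(t - t_mix) / t_width))
    early_part = (1.0 - w) * early
    late_part = w * late
    # Scale the tail so that (late energy) / (early energy) == eta.
    # This definition of eta is an assumption for this sketch.
    gain = np.sqrt(eta * np.sum(early_part ** 2) / np.sum(late_part ** 2))
    late_part = gain * late_part
    return early_part + late_part, early_part, late_part

# Toy early response: a direct path and two reflections (made-up values).
fs = 16000
early = np.zeros(fs // 2)
early[[50, 180, 260]] = [1.0, 0.6, 0.4]
rir, early_part, late_part = hybrid_rir(early, fs, t60=0.4, eta=0.5, t_mix=0.03)
```

In this sketch the target ratio is a free parameter; in the paper it is derived from the room geometry and the directivity term $D(\varphi,\rho)$, which is not reproduced here.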

Abstract

The room impulse response (RIR) encodes, among other things, information about the distance of an acoustic source from the sensors. Deep neural networks (DNNs) have been shown to be able to extract that information for acoustic distance estimation. Since only a very limited amount of annotated data exists, e.g., RIRs with distance information, training a DNN for acoustic distance estimation has to rely on simulated RIRs, which results in an unavoidable mismatch with the RIRs of real rooms. In this contribution, we show that this mismatch can be reduced by a novel combination of geometric and stochastic modeling of RIRs, resulting in significantly improved distance estimation accuracy.

Paper Structure (8 sections, 7 equations, 1 figure, 3 tables)

Introduction
Review on RIR simulation techniques
Directivity of sources and microphones
Stochastic RIR models
Proposed RIR simulation technique
Distance Estimator
Experiments
Summary

Figures (1)

Figure 1: Visualization of the proposed approach to simulation
