Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones
Francesca Ronchini, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti
TL;DR
This work tackles reconstructing room transfer functions (RTFs) to recover a full complex sound field from sparse, irregular microphone measurements. It introduces a complex-valued neural network (CVNN) with a U-Net-like architecture that ingests incomplete RTF data $\tilde{\mathbf{G}}$ and a measurement mask, learning a mapping $\mathcal{U}(\tilde{\mathbf{G}})$ to full complex RTFs across $K$ frequency bins in the modal range $[30,300]$ Hz. The method is evaluated on 5,000 synthetic rooms and real ISOBEL Room B data, demonstrating superior complex NMSE and phase fidelity compared to kernel-based interpolation, and showing favorable comparisons with magnitude-only approaches especially at low frequencies. This CVNN-based approach enables accurate, phase-aware sound-field reconstruction with relatively few sensors, supporting improved immersive audio, room acoustics analysis, and practical deployment in varied rooms.
Abstract
Reconstructing the room transfer functions needed to calculate the complex sound field in a room has several important real-world applications. However, an impractical number of microphones is often required. Recently, in addition to classical signal processing methods, deep learning techniques have been applied to reconstruct the room transfer function starting from a very limited set of measurements at scattered points in the room. In this paper, we employ complex-valued neural networks to estimate room transfer functions in the frequency range of the first room resonances, using a few irregularly distributed microphones. To the best of our knowledge, this is the first time that complex-valued neural networks have been used to estimate room transfer functions. To analyze the benefits of applying complex-valued optimization to the considered task, we compare the proposed technique with a state-of-the-art kernel-based signal processing approach for sound field reconstruction, showing that the proposed technique offers clear advantages in terms of phase accuracy and overall quality of the reconstructed sound field. For reference, we also compare the model with a similarly structured data-driven approach that, however, applies a real-valued neural network to reconstruct only the magnitude of the sound field.
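To make the setting concrete, the following is a minimal NumPy sketch of the two ingredients the summary relies on: masking a full complex RTF grid down to a few irregularly placed microphones, and scoring a reconstruction with a complex-valued normalized mean squared error (NMSE). It is an illustration under stated assumptions, not the authors' implementation: the grid size (32 x 32), the number of frequency bins (40), the number of microphones (15), the random stand-in data, the function names `mask_rtf` and `complex_nmse_db`, and the per-bin NMSE definition are all hypothetical choices made for this example.

```python
import numpy as np

def mask_rtf(G, num_mics, rng):
    """Keep only `num_mics` randomly chosen grid positions of a complex RTF grid.

    G has shape (H, W, K): an H x W grid of positions and K frequency bins.
    Returns the incomplete grid (zeros at unmeasured positions) and the boolean mask.
    """
    H, W, _ = G.shape
    idx = rng.choice(H * W, size=num_mics, replace=False)
    mask = np.zeros(H * W, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(H, W)
    G_tilde = G * mask[..., None]  # broadcast the mask over the frequency axis
    return G_tilde, mask

def complex_nmse_db(G_hat, G):
    """Per-frequency-bin NMSE in dB, computed on complex values (assumed definition)."""
    num = np.sum(np.abs(G_hat - G) ** 2, axis=(0, 1))
    den = np.sum(np.abs(G) ** 2, axis=(0, 1))
    return 10 * np.log10(num / den)

rng = np.random.default_rng(0)
# Random stand-in for a ground-truth RTF grid (hypothetical sizes, not real room data).
G = rng.standard_normal((32, 32, 40)) + 1j * rng.standard_normal((32, 32, 40))
G_tilde, mask = mask_rtf(G, num_mics=15, rng=rng)

# An estimate with exact magnitude but a 90 degree phase error at every point.
G_phase_err = G * np.exp(1j * np.pi / 2)
nmse = complex_nmse_db(G_phase_err, G)
mag_err = np.max(np.abs(np.abs(G_phase_err) - np.abs(G)))
```

In this toy case the magnitude error `mag_err` is numerically zero, while the complex NMSE is 10 log10(2), about 3 dB, in every bin. A magnitude-only metric would rate this estimate as perfect, which is why a phase-aware error measure matters when the goal is the full complex sound field.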
