Implicit neural representation with physics-informed neural networks for the reconstruction of the early part of room impulse responses

Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

TL;DR

The paper addresses the reconstruction of the early part of room impulse responses (RIRs) from undersampled multichannel measurements by combining a neural implicit representation with physics-based regularization. It introduces PI-SIREN, a 5-layer SIREN network trained with a data fidelity term and a wave equation PDE loss that enforces the physics in the time domain. Across simulated and real data, PI-SIREN outperforms a standard PINN and a plain SIREN, and remains competitive with state-of-the-art compressed sensing and deep-prior methods while keeping the model lightweight. This physics-informed, data-efficient approach enables accurate early-RIR reconstruction, with potential benefits for localization, timbre, and room characterization in immersive audio applications.
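
The two-term objective described above (data fidelity on the available microphones plus a wave equation penalty) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the hidden width of 64, the value `OMEGA_0 = 30`, the microphone spacing, the sampling rate, the weight `lam`, and the restriction to a 1D wave equation along the array axis are all assumptions made here, and central finite differences stand in for the automatic differentiation a PINN would normally use. No training loop is included; the sketch only evaluates the loss.

```python
import numpy as np

C = 343.0       # speed of sound in m/s (assumed value)
OMEGA_0 = 30.0  # sine frequency scale, a common SIREN choice (assumed here)

def init_siren(rng, widths):
    """Draw SIREN-style uniform weights for layer sizes such as [2, 64, ..., 1]."""
    params = []
    for i, (n_in, n_out) in enumerate(zip(widths[:-1], widths[1:])):
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / OMEGA_0
        W = rng.uniform(-bound, bound, size=(n_in, n_out))
        b = rng.uniform(-bound, bound, size=n_out)
        params.append((W, b))
    return params

def siren(params, x, t):
    """Map positions x (m) and times t (s) of equal shape to a pressure value."""
    z = np.stack([x, t], axis=-1)
    for W, b in params[:-1]:
        z = np.sin(OMEGA_0 * (z @ W + b))  # sinusoidal activation
    W, b = params[-1]
    return (z @ W + b)[..., 0]             # final layer is linear

def wave_residual(p, x, t, dx=1e-3, dt=1e-6):
    """Central-difference estimate of p_xx - p_tt / C**2 for a callable p(x, t)."""
    p_xx = (p(x + dx, t) - 2.0 * p(x, t) + p(x - dx, t)) / dx**2
    p_tt = (p(x, t + dt) - 2.0 * p(x, t) + p(x, t - dt)) / dt**2
    return p_xx - p_tt / C**2

def pi_loss(p, x_obs, t_obs, h_obs, x_col, t_col, lam=1.0):
    """Data term on observed channels plus PDE term on all collocation points."""
    data = np.mean((p(x_obs, t_obs) - h_obs) ** 2)
    pde = np.mean(wave_residual(p, x_col, t_col) ** 2)
    return data + lam * pde

# Hypothetical array: 32 microphones, 3 cm apart, 64 samples at 8 kHz.
x = np.arange(32) * 0.03
t = np.arange(64) / 8000.0
X, T = np.meshgrid(x, t, indexing="ij")   # full grid of collocation points
X_obs, T_obs = X[::3], T[::3]             # keep every third microphone

# A 500 Hz plane wave travelling along the array stands in for measured data.
plane = lambda xx, tt: np.sin(2.0 * np.pi * 500.0 * (tt - xx / C))
H_obs = plane(X_obs, T_obs)

rng = np.random.default_rng(0)
params = init_siren(rng, [2, 64, 64, 64, 64, 1])  # five linear layers
net = lambda xx, tt: siren(params, xx, tt)

loss_net = pi_loss(net, X_obs, T_obs, H_obs, X, T)      # untrained network
loss_ref = pi_loss(plane, X_obs, T_obs, H_obs, X, T)    # exact solution
```

The plane wave satisfies the 1D wave equation, so `loss_ref` is close to zero up to finite-difference error, while the untrained network gives a larger value that an optimizer would then reduce. The PDE term is evaluated on every microphone position, including those without measurements, which is what lets the physics constrain the missing channels.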

Abstract

Recently, deep learning and machine learning approaches have been widely employed for various applications in acoustics. Nonetheless, in the area of sound field processing and reconstruction, classic methods based on the solutions of the wave equation are still widespread. Physics-informed neural networks have recently been proposed as a deep learning paradigm for solving the partial differential equations that govern physical phenomena, bridging the gap between purely data-driven and model-based methods. Here, we exploit physics-informed neural networks to reconstruct the early part of missing room impulse responses in a uniform linear array. This methodology allows us to exploit the underlying law of acoustics, i.e., the wave equation, forcing the neural network to generate physically meaningful solutions given only a limited number of data points. The results on real measurements show that the proposed model achieves accurate reconstruction, with performance in line with state-of-the-art deep-learning and compressed sensing techniques, while maintaining a lightweight architecture.

Paper Structure

This paper contains 11 sections, 10 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: Example of RIRs $\mathbf{H}$ of a uniform linear array (ULA) with $M=32$ microphones.
  • Figure 2: (a) Simulated RIRs $\mathbf{H}$. (b) The observation $\mathbf{H}_{\tilde{m}}$ of $\tilde{M}=33$ microphones employed as input for the networks. The reconstructions obtained using PI-SIREN (c), SIREN (d) and PINN (e).
  • Figure 3: Reconstruction of the RIRs of the Munin room using $\tilde{M}=20$ available sensors. (a) The measured RIRs $\mathbf{H}$. The reconstructions obtained using the proposed model $\hat{\mathbf{H}}_{\mathrm{PI-SIREN}}$ (b), DP [pezzoli2022deep] (c) and CS [zea2019compressed] (d).