Reciprocal Latent Fields for Precomputed Sound Propagation
Hugo Seuté, Pranai Vasudev, Etienne Richan, Louis-Xavier Buffoni
TL;DR
The paper addresses real-time, physically plausible sound propagation in complex scenes by introducing Reciprocal Latent Fields (RLF), a memory-efficient framework that stores acoustic information in a grid of latent embeddings and decodes it with symmetry-enforcing decoders, so that predictions respect acoustic reciprocity. It explores Euclidean and Riemannian decoder variants and extends the approach beyond path distance to multiple acoustic parameters (levels, decay times), achieving substantial memory reductions while maintaining perceptual fidelity. In quantitative evaluations and a MUSHRA-style subjective listening study, RLF reaches near-ground-truth quality with a memory footprint several orders of magnitude smaller, enabling real-time rendering in large game maps. The work also outlines practical training setups and robust latent space designs, and it notes limitations related to static geometries along with possible extensions to dynamic scenes and other reciprocal quantities.
Abstract
Realistic sound propagation is essential for immersion in a virtual scene, yet physically accurate wave-based simulations remain computationally prohibitive for real-time applications. Wave coding methods address this limitation by precomputing and compressing impulse responses of a given scene into a set of scalar acoustic parameters, which can reach unmanageable sizes in large environments with many source-receiver pairs. We introduce Reciprocal Latent Fields (RLF), a memory-efficient framework for encoding and predicting these acoustic parameters. The RLF framework employs a volumetric grid of trainable latent embeddings decoded with a symmetric function, ensuring acoustic reciprocity. We study a variety of decoders and show that leveraging Riemannian metric learning leads to a better reproduction of acoustic phenomena in complex scenes. Experimental validation demonstrates that RLF maintains replication quality while reducing the memory footprint by several orders of magnitude. Furthermore, a MUSHRA-like subjective listening test indicates that sound rendered via RLF is perceptually indistinguishable from ground-truth simulations.
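To illustrate how a symmetric decoder over a latent grid guarantees reciprocity, here is a minimal sketch. It is not the authors' implementation: the grid size, latent dimension, cell size, nearest-neighbour lookup, and the names `lookup_latent`, `decode_symmetric`, and `predict` are all assumptions made for illustration, and the latents are random rather than trained. The decoder used here is a plain Euclidean distance between the two latent vectors, chosen only because it is symmetric by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: an 8 x 8 x 4 grid of 16-dimensional latent vectors.
GRID_SHAPE = (8, 8, 4)
LATENT_DIM = 16
CELL_SIZE = 1.0  # assumed grid spacing, in scene units

# In a trained model these embeddings would be learned; here they are random.
latent_grid = rng.normal(size=GRID_SHAPE + (LATENT_DIM,))


def lookup_latent(position):
    """Nearest-neighbour lookup of the latent vector for a 3D position."""
    idx = np.floor(np.asarray(position, dtype=float) / CELL_SIZE).astype(int)
    idx = np.clip(idx, 0, np.array(GRID_SHAPE) - 1)  # stay inside the grid
    return latent_grid[tuple(idx)]


def decode_symmetric(z_a, z_b):
    """Illustrative symmetric decoder: Euclidean distance between latents.

    The output depends only on the magnitude of (z_a - z_b), so swapping
    the two inputs cannot change the result.
    """
    return float(np.linalg.norm(z_a - z_b))


def predict(source, receiver):
    """Predict one scalar acoustic parameter for a source-receiver pair."""
    return decode_symmetric(lookup_latent(source), lookup_latent(receiver))


source = (1.2, 3.4, 0.5)
receiver = (6.7, 2.1, 3.9)

# Reciprocity: swapping source and receiver gives the same prediction.
print(np.isclose(predict(source, receiver), predict(receiver, source)))
```

Because memory scales with the number of grid cells rather than with the number of source-receiver pairs, a design of this kind stores one embedding per location instead of one parameter set per pair.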
