INFER : Learning Implicit Neural Frequency Response Fields for Confined Car Cabin

Harshvardhan C. Takawale, Nirupam Roy, Phil Brown

TL;DR

This work addresses the challenge of accurately modeling acoustics inside confined car cabins by learning frequency-domain, complex-valued transfer fields. It introduces INFER, a two-branch implicit neural field that directly predicts a complex attenuation δ(f, x) and a directional retransmission S(f, x, n) and renders the field via frequency-domain ray marching. Its core innovations are perceptual and hardware-aware spectral supervision, and a KK-consistency regularizer that enforces causality by coupling attenuation and phase. Evaluations on simulated and real automotive datasets show substantial gains in magnitude and phase fidelity over prior time-domain and hybrid methods, highlighting the method's potential for adaptive cabin equalization, spatial audio rendering, and personalized acoustic experiences.
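As a rough illustration of how a Kramers-Kronig (KK) style penalty can couple magnitude and phase, the sketch below uses one common discrete form of the relation: for a causal, minimum-phase response, the phase is determined by the log-magnitude through a Hilbert-transform pair, which can be computed with a folded real cepstrum. This is not taken from the paper, whose regularizer is not spelled out here. The function names `minimum_phase_from_magnitude` and `kk_penalty`, the minimum-phase assumption, and the full uniform FFT grid are all assumptions made for this example.

```python
import numpy as np

def minimum_phase_from_magnitude(mag):
    """Phase implied by a magnitude spectrum under a minimum-phase assumption.

    `mag` is sampled on a full, symmetric FFT grid (hypothetical setup).
    """
    n = len(mag)
    log_mag = np.log(np.maximum(mag, 1e-12))   # guard against log(0)
    cep = np.fft.ifft(log_mag).real            # real cepstrum of log|H|
    fold = np.zeros(n)
    fold[0] = 1.0                              # keep the zero-quefrency term
    fold[1:(n + 1) // 2] = 2.0                 # double the causal part
    if n % 2 == 0:
        fold[n // 2] = 1.0                     # Nyquist term is shared
    # FFT of the folded cepstrum is log|H| + j*phase; keep the phase.
    return np.fft.fft(cep * fold).imag

def kk_penalty(pred_mag, pred_phase):
    """Mean squared distance between predicted and KK-implied phase.

    The comparison is done on the unit circle so phase wrapping does not matter.
    """
    kk_phase = minimum_phase_from_magnitude(pred_mag)
    diff = np.exp(1j * pred_phase) - np.exp(1j * kk_phase)
    return float(np.mean(np.abs(diff) ** 2))
```

For a response that is already minimum phase, such as the two-tap filter `[1.0, 0.5]`, the penalty is numerically zero; negating the phase of the same response, which breaks the causal coupling, yields a clearly positive penalty.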

Abstract

Accurate modeling of spatial acoustics is critical for immersive and intelligible audio in confined, resonant environments such as car cabins. Current tuning methods are manual, hardware-intensive, and static, failing to account for frequency-selective behaviors and dynamic changes like passenger presence or seat adjustments. To address this issue, we propose INFER: Implicit Neural Frequency Response fields, a frequency-domain neural framework that is jointly conditioned on source and receiver positions and orientations to directly learn complex-valued frequency response fields inside confined, resonant environments like car cabins. We introduce three key innovations over current neural acoustic modeling methods: (1) a novel end-to-end frequency-domain forward model that directly learns the frequency response field and frequency-specific attenuation in 3D space; (2) perceptual and hardware-aware spectral supervision that emphasizes critical auditory frequency bands and deemphasizes unstable crossover regions; and (3) a physics-based Kramers-Kronig consistency constraint that regularizes frequency-dependent attenuation and delay. We evaluate our method over real-world data collected in multiple car cabins. Our approach significantly outperforms time- and hybrid-domain baselines on both simulated and real-world automotive datasets, cutting average magnitude and phase reconstruction errors by over 39% and 51%, respectively. INFER sets a new state of the art for neural acoustic modeling in automotive spaces.
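To make the idea of perceptual and hardware-aware spectral supervision concrete, the following minimal sketch weights a complex spectral error per frequency bin, boosting an emphasized band and damping a crossover band. The band edges, the weight values, and the names `band_weights` and `weighted_spectral_loss` are hypothetical placeholders chosen for illustration; the abstract does not state the actual bands or weighting used by INFER.

```python
import numpy as np

def band_weights(freqs_hz, emphasis=(300.0, 4000.0), crossover=(2000.0, 3000.0),
                 boost=2.0, damp=0.25):
    """Per-bin weights: boost an emphasized band, damp a crossover band.

    All band edges and weight values are illustrative assumptions.
    """
    w = np.ones_like(freqs_hz, dtype=float)
    w[(freqs_hz >= emphasis[0]) & (freqs_hz <= emphasis[1])] = boost
    # The crossover band is applied last so it overrides the boost.
    w[(freqs_hz >= crossover[0]) & (freqs_hz <= crossover[1])] = damp
    return w / w.mean()  # normalize so the average weight is 1

def weighted_spectral_loss(pred, target, weights):
    """Weighted mean squared error between complex frequency responses."""
    return float(np.mean(weights * np.abs(pred - target) ** 2))
```

With these weights, an error of a given size inside the crossover band contributes less to the loss than the same error inside the emphasized band, which is the qualitative behavior the abstract describes.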

Paper Structure

This paper contains 30 sections, 11 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Acoustic field modeling inside a car cabin. Left: Measurement setup in the backseat of a Tesla Model X, where a speaker emits sound and spatial responses are recorded over a 2D grid. Middle: Spatial distribution of phase at 720 Hz for ground truth (GT), our method (Ours), and baselines (AVR, NAF, INRAS). Right: Corresponding log-magnitude (energy) plots. Our method reconstructs smoother and physically consistent fields that preserve wavefront geometry and acoustic shadowing, outperforming baselines that exhibit artifacts or spatial inconsistency.
  • Figure 2: System Overview. Illustration of our frequency-domain acoustic forward model. For each point sampled along rays cast from the microphone, the MLP predicts a frequency-domain signal and attenuation. A time-of-flight (TOF) phase shift is applied to the signal, followed by material-based absorption and phase shifts, and the final response is produced by accumulating the signal from all directions.
  • Figure 3: Data collection setup. (a) Left: Data is collected in a controlled environment, the BUCK, a vehicle mockup with a realistic car interior and acoustic front end. (b) Right: Data is also collected in a real environment, a Tesla Model X.
  • Figure 4: Qualitative results. (Left) Spatial plots comparing the ground truth (GT) with our method's reconstruction of the magnitude and phase fields across various frequency bands.
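The forward model described in the Figure 2 caption can be sketched for a single ray as follows. A propagation delay of d / c over distance d corresponds to the frequency-domain factor exp(-j 2π f d / c), which is standard; everything else here is an assumed discretization, not the paper's rendering equation. In particular, the names `tof_phase` and `march_ray`, the cumulative-exponential transmittance, the step-size quadrature weights, and the speed of sound of 343 m/s are assumptions for this example, and the per-sample signals and attenuations stand in for the outputs of the MLP.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def tof_phase(freqs_hz, distance_m, c=SPEED_OF_SOUND):
    """Frequency-domain factor for a pure propagation delay of distance_m / c."""
    return np.exp(-2j * np.pi * freqs_hz * distance_m / c)

def march_ray(freqs_hz, distances_m, signals, attenuations):
    """Accumulate complex contributions along one ray cast from the microphone.

    distances_m:  (n_samples,) increasing distances from the microphone
    signals:      (n_samples, n_freqs) complex signal at each sample
    attenuations: (n_samples, n_freqs) complex attenuation at each sample
    """
    steps = np.diff(distances_m, prepend=0.0)            # segment lengths
    # Attenuation accumulated between the microphone and each sample.
    transmittance = np.exp(-np.cumsum(attenuations * steps[:, None], axis=0))
    # Time-of-flight phase shift for each (sample, frequency) pair.
    delay = tof_phase(freqs_hz[None, :], distances_m[:, None])
    return np.sum(signals * transmittance * delay * steps[:, None], axis=0)
```

A full response would sum such per-ray results over all sampled directions. Even this single-ray version shows interference: with unit signals and zero attenuation at 0.5 m and 1.0 m, the two contributions cancel at 343 Hz and add coherently at 686 Hz.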