NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization
Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux
TL;DR
Upsampling and personalizing head-related transfer functions (HRTFs) remain challenging when measurements are sparse. The authors introduce NIIRF, a neural field (NF) that outputs the parameters of a cascade of parametric IIR filters within a differentiable DSP framework, rather than estimating the HRTF magnitude for FIR reconstruction. NIIRF matches prior magnitude-based NF methods and outperforms them when data are limited, and it supports efficient subject personalization via auto-decoding, BitFit, or LoRA. Compared with FIR reconstruction, the approach reduces memory and computation while delivering accurate, direction-continuous HRTFs suitable for immersive audio applications. The codebase is made available for replication and extension.
Abstract
Head-related transfer functions (HRTFs) are important for immersive audio, and their spatial interpolation has been studied to upsample finite measurements. Recently, neural fields (NFs), which map from sound source direction to HRTF, have gained attention. Existing NF-based methods focus on estimating the magnitude of the HRTF from a given sound source direction, and the magnitude is then converted to a finite impulse response (FIR) filter. We propose the neural infinite impulse response filter field (NIIRF) method, which instead estimates the coefficients of cascaded IIR filters. IIR filters mimic the modal nature of HRTFs and thus need fewer coefficients than FIR filters to approximate them well. We find that our method can match the performance of existing NF-based methods on multiple datasets, even outperforming them when measurements are sparse. We also explore approaches to personalize the NF to a subject and experimentally find low-rank adaptation to be effective.
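To illustrate why a cascade of IIR filters can describe a spectrum with few numbers, the following minimal sketch evaluates the magnitude response of a cascade of second-order peaking filters, each specified by only a center frequency, a gain, and a Q factor. It is not the authors' implementation: the filter type (peaking biquads in the widely used Audio EQ Cookbook form), the function names, the sample rate, and the parameter values are all assumptions chosen for illustration. In NIIRF, such parameters would be produced by the neural field for each source direction, whereas here they are fixed by hand.

```python
import cmath
import math


def peaking_biquad(f0, gain_db, q, fs):
    """Return (b, a) coefficients of a peaking biquad (Audio EQ Cookbook form)."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a


def biquad_response(b, a, f, fs):
    """Evaluate the complex frequency response of one biquad at frequency f (Hz)."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^{-1} on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den


def cascade_magnitude_db(sections, f, fs):
    """Magnitude (dB) of a cascade: the product of the section responses."""
    h = 1.0 + 0.0j
    for f0, gain_db, q in sections:
        b, a = peaking_biquad(f0, gain_db, q, fs)
        h *= biquad_response(b, a, f, fs)
    return 20.0 * math.log10(abs(h))


# Hypothetical parameters (center frequency in Hz, gain in dB, Q), chosen by hand.
FS = 48000.0
sections = [(4000.0, 6.0, 2.0), (9000.0, -12.0, 4.0)]

for f in (1000.0, 4000.0, 9000.0):
    print(f, round(cascade_magnitude_db(sections, f, FS), 2))
```

Each section here contributes three parameters, so this two-section cascade is described by six numbers, whereas an FIR filter needs one coefficient per tap. Because every step is a smooth function of the parameters, the same computation can in principle be written in an automatic differentiation framework and trained end to end.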
