Canonical Factors for Hybrid Neural Fields
Brent Yi, Weijia Zeng, Sam Buchanan, Yi Ma
TL;DR
This work tackles the axis-aligned biases of the factored feature volumes used by hybrid neural fields. It introduces TILTED, a transform-invariant latent decomposition that learns canonical factors jointly with domain transformations, coupled with coarse-to-fine optimization to handle high-frequency content. Theoretical results in a 2D model problem justify learning alignment and representation jointly, and experiments across 2D images, SDFs, and NeRFs demonstrate improved quality, robustness, and memory/runtime efficiency. On real-world scenes, the gains include halved parameter counts and 25% faster training. The findings also reveal evaluation biases in radiance-field pipelines and suggest broader applicability to other transformations and overparameterized settings.
Abstract
Factored feature volumes offer a simple way to build more compact, efficient, and interpretable neural fields, but also introduce biases that are not necessarily beneficial for real-world data. In this work, we (1) characterize the undesirable biases that these architectures have for axis-aligned signals, which can lead to radiance field reconstruction differences as high as 2 dB in PSNR, and (2) explore how learning a set of canonicalizing transformations can improve representations by removing these biases. We prove in a two-dimensional model problem that simultaneously learning these transformations together with scene appearance succeeds with drastically improved efficiency. We validate the resulting architectures, which we call TILTED, using image, signed distance, and radiance field reconstruction tasks, where we observe improvements across quality, robustness, compactness, and runtime. Results demonstrate that TILTED can enable capabilities comparable to baselines that are 2x larger, while highlighting weaknesses of neural field evaluation procedures.
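The axis-alignment bias described above can be illustrated with a small numerical sketch. This is not the TILTED architecture or the paper's experimental setup: it assumes a toy binary image of a square and uses a truncated SVD as a stand-in for a low-rank factored representation. The function names `square_image` and `low_rank_error` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def square_image(n=64, angle_deg=0.0, half_width=0.4):
    """Binary image of a square centered at the origin, rotated by angle_deg."""
    coords = np.linspace(-1.0, 1.0, n)
    x, y = np.meshgrid(coords, coords, indexing="xy")
    theta = np.deg2rad(angle_deg)
    # Express each pixel in the square's own (rotated) coordinate frame.
    u = np.cos(theta) * x + np.sin(theta) * y
    v = -np.sin(theta) * x + np.cos(theta) * y
    inside = (np.abs(u) <= half_width) & (np.abs(v) <= half_width)
    return inside.astype(np.float64)

def low_rank_error(image, rank):
    """Relative Frobenius error of the best rank-`rank` approximation."""
    s = np.linalg.svd(image, compute_uv=False)
    return float(np.sqrt(np.sum(s[rank:] ** 2) / np.sum(s ** 2)))

# The same shape, once aligned with the grid axes and once rotated by 45 degrees.
aligned = square_image(angle_deg=0.0)
rotated = square_image(angle_deg=45.0)

err_aligned = low_rank_error(aligned, rank=1)
err_rotated = low_rank_error(rotated, rank=1)

print(f"rank-1 error, axis-aligned square: {err_aligned:.3e}")
print(f"rank-1 error, rotated square:      {err_rotated:.3e}")
```

The axis-aligned square is an outer product of two 1D indicator vectors, so a single factor pair captures it essentially exactly, while the rotated copy of the same shape is not rank one and leaves a substantial residual. In this toy setting, a transformation that undoes the rotation before factoring would restore the compact representation, which is the intuition behind learning canonicalizing transformations jointly with the factors.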
