Depth Completion as Parameter-Efficient Test-Time Adaptation
Bingxin Ke, Qunjie Zhou, Jiahui Huang, Xuanchi Ren, Tianchang Shen, Konrad Schindler, Laura Leal-Taixé, Shengyu Huang
TL;DR
CAPA reframes depth completion as parameter-efficient test-time adaptation: a frozen 3D foundation model is adapted to sparse depth cues, and its strong geometric prior is grounded in the scene by per-sample gradients. It implements two PEFT strategies, LoRA and Visual Prompt Tuning, which update only a tiny fraction of the parameters. This enables efficient per-sample fine-tuning (and sequence-level fine-tuning for videos) while leaving the backbone untouched. Across indoor and outdoor datasets, CAPA achieves state-of-the-art accuracy and superior temporal consistency, outperforming baselines and reducing the base model's error by a factor of 2–3. The approach enables robust, scene-specific depth reconstruction on standard hardware and paves the way for practical high-fidelity 3D mapping and regeneration tasks with minimal computation.
Abstract
We introduce CAPA, a parameter-efficient test-time optimization framework that adapts pre-trained 3D foundation models (FMs) for depth completion, using sparse geometric cues. Unlike prior methods that train task-specific encoders for auxiliary inputs, which often overfit and generalize poorly, CAPA freezes the FM backbone. Instead, it updates only a minimal set of parameters using Parameter-Efficient Fine-Tuning (e.g. LoRA or VPT), guided by gradients calculated directly from the sparse observations available at inference time. This approach effectively grounds the foundation model's geometric prior in the scene-specific measurements, correcting distortions and misplaced structures. For videos, CAPA introduces sequence-level parameter sharing, jointly adapting all frames to exploit temporal correlations, improve robustness, and enforce multi-frame consistency. CAPA is model-agnostic, compatible with any ViT-based FM, and achieves state-of-the-art results across diverse condition patterns on both indoor and outdoor datasets. Project page: research.nvidia.com/labs/dvl/projects/capa.
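The adaptation loop described in the abstract can be illustrated with a toy sketch. This is not the CAPA implementation: the two-layer "foundation model", the synthetic scene, the squared-error loss on observed pixels, and all names (`W1`, `w2`, `A`, `B`, `sparse_idx`) are assumptions made for illustration. It only shows the general pattern of freezing the backbone weights and fitting a low-rank (LoRA-style) update with gradients from sparse depth observations at test time.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, d_in, d_hidden, rank = 256, 8, 16, 2

# Frozen toy "foundation model" (hypothetical): depth = tanh(X @ W1.T) @ w2
W1 = rng.normal(scale=0.5, size=(d_hidden, d_in))
w2 = rng.normal(scale=0.5, size=d_hidden)

# Synthetic scene: per-pixel features and a ground-truth depth that the
# frozen model gets slightly wrong (simulated by perturbing W1).
X = rng.normal(size=(n_pixels, d_in))
delta = 0.1 * rng.normal(size=(d_hidden, 1)) @ rng.normal(size=(1, d_in))
depth_true = np.tanh(X @ (W1 + delta).T) @ w2

# Sparse depth cues: only about 10% of the pixels are observed.
sparse_idx = rng.choice(n_pixels, size=n_pixels // 10, replace=False)

# LoRA-style factors: effective weight is W1 + B @ A. B starts at zero,
# so the adapted model initially equals the frozen model.
A = rng.normal(scale=0.1, size=(rank, d_in))
B = np.zeros((d_hidden, rank))


def predict(A, B):
    """Return predicted depth and hidden activations for all pixels."""
    H = np.tanh(X @ (W1 + B @ A).T)
    return H @ w2, H


def sparse_mse(pred):
    """Mean squared error on the observed pixels only."""
    return float(np.mean((pred[sparse_idx] - depth_true[sparse_idx]) ** 2))


loss_before = sparse_mse(predict(A, B)[0])

lr = 0.02
for _ in range(500):
    pred, H = predict(A, B)
    # Gradient of the sparse loss w.r.t. predictions (zero where unobserved).
    d_pred = np.zeros(n_pixels)
    d_pred[sparse_idx] = 2.0 * (pred[sparse_idx] - depth_true[sparse_idx]) / len(sparse_idx)
    # Backpropagate through tanh to the effective weight W1 + B @ A.
    d_Z = d_pred[:, None] * w2[None, :] * (1.0 - H ** 2)
    d_W = d_Z.T @ X
    # Chain rule onto the low-rank factors; W1 and w2 are never updated.
    grad_A, grad_B = B.T @ d_W, d_W @ A.T
    A -= lr * grad_A
    B -= lr * grad_B

loss_after = sparse_mse(predict(A, B)[0])
print(f"sparse loss before: {loss_before:.4f}, after: {loss_after:.4f}")
print(f"trainable parameters: {A.size + B.size} of {W1.size + w2.size + A.size + B.size}")
```

In this sketch only `A` and `B` receive gradient updates, which mirrors the idea of adapting a minimal set of parameters per sample. For a video, one plausible reading of sequence-level parameter sharing is to use a single `A`, `B` pair and sum the sparse loss over all frames, though the sketch does not show that.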
