PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation
Yue-Jiang Dong, Yuan-Chen Guo, Ying-Tian Liu, Fang-Lue Zhang, Song-Hai Zhang
TL;DR
PPEA-Depth tackles the static-scene limitation of self-supervised monocular depth estimation with progressive parameter-efficient adaptation based on encoder and decoder adapters. Training proceeds in two stages: the first uses static-scene data to learn robust depth priors, and the second moves to dynamic scenes, updating the adapters while the core weights remain largely fixed. The method achieves state-of-the-art results on KITTI, CityScapes, and DDAD. It preserves the generalized patterns of the pre-trained model, reduces tunable parameters by up to 90%, and adapts data-efficiently (with as little as 3% of the data in Stage 2) while remaining robust to object motion. This advances practical depth estimation in real-world dynamic environments and suggests that similar adapter-based transfer could extend to other tasks with loosely constrained losses.
Abstract
Self-supervised monocular depth estimation is important for applications such as autonomous driving and robotics. However, self-supervision relies on a strong static-scene assumption, which makes it difficult to achieve optimal performance in dynamic scenes, and such scenes are prevalent in most real-world situations. To address this issue, we propose PPEA-Depth, a Progressive Parameter-Efficient Adaptation approach that transfers a pre-trained image model to self-supervised depth estimation. Training comprises two sequential stages: an initial stage on a dataset composed primarily of static scenes, followed by an expansion to more intricate datasets involving dynamic scenes. To facilitate this process, we design compact encoder and decoder adapters that enable parameter-efficient tuning, allowing the network to adapt effectively. The adapters not only preserve the generalized patterns of the pre-trained image model but also carry knowledge gained in the preceding stage into the subsequent one. Extensive experiments demonstrate that PPEA-Depth achieves state-of-the-art performance on the KITTI, CityScapes, and DDAD datasets.
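
The idea of parameter-efficient tuning with adapters can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the residual bottleneck form of `adapter_forward`, the zero-initialized up-projection, the `ParamGroup` bookkeeping class, the rule that only adapter groups are trainable, and all parameter counts are assumptions chosen purely for illustration. The abstract does not specify the adapter architecture or exactly which weights are frozen in each stage.

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    """Hypothetical bookkeeping record for one group of network weights."""
    name: str
    num_params: int
    is_adapter: bool
    trainable: bool = True


def adapter_forward(x, w_down, w_up):
    """Assumed residual bottleneck adapter: x + W_up * relu(W_down * x)."""
    # Project down to a small hidden size and apply ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w_down]
    # Project back up to the input size.
    delta = [sum(w * h for w, h in zip(row, hidden)) for row in w_up]
    # The residual connection keeps the pre-trained features intact.
    return [v + d for v, d in zip(x, delta)]


def freeze_all_but_adapters(groups):
    """Mark only adapter groups as trainable; pre-trained weights stay fixed."""
    for group in groups:
        group.trainable = group.is_adapter
    return groups


def trainable_fraction(groups):
    """Fraction of all parameters that would be updated during tuning."""
    total = sum(g.num_params for g in groups)
    tuned = sum(g.num_params for g in groups if g.trainable)
    return tuned / total


# Illustrative parameter counts only; these are not figures from the paper.
groups = freeze_all_but_adapters([
    ParamGroup("encoder_backbone", 85_000, is_adapter=False),
    ParamGroup("decoder_backbone", 7_000, is_adapter=False),
    ParamGroup("encoder_adapters", 5_000, is_adapter=True),
    ParamGroup("decoder_adapters", 3_000, is_adapter=True),
])

# With a zero-initialized up-projection the adapter starts as the identity,
# so tuning begins from the behavior of the pre-trained model.
features = [1.0, -2.0, 3.0]
w_down = [[0.5, 0.1, -0.3], [0.2, -0.4, 0.6]]   # 3 -> 2 bottleneck
w_up = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]     # 2 -> 3, all zeros
adapted = adapter_forward(features, w_down, w_up)
```

In this sketch the same adapter groups would be carried from the first stage into the second, which is one plausible reading of how knowledge from the static-scene stage is retained while the pre-trained weights remain untouched.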
