BoRe-Depth: Self-supervised Monocular Depth Estimation with Boundary Refinement for Embedded Systems
Chang Liu, Juan Li, Sheng Zhang, Chang Liu, Jie Li, Xu Zhang
TL;DR
BoRe-Depth tackles boundary blur in self-supervised monocular depth estimation for embedded systems by combining a lightweight MPViT encoder with an Enhanced Feature Adaptive Fusion (EFAF) decoder and a two-stage training regime that incorporates a semantic information loss. The method uses pseudo-depth labels for supervision and a frozen semantic encoder to transfer semantic knowledge, yielding sharper object boundaries while running in real time (50.7 FPS) on NVIDIA Jetson Orin with only 8.7M parameters. Among lightweight models, it achieves state-of-the-art boundary quality and depth accuracy on NYUv2 and KITTI, generalizes well zero-shot to iBims-1, and is supported by ablations that validate each component. In practice, this enables accurate, boundary-aware depth maps on resource-constrained platforms for autonomous systems, robotics, and AR applications.
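To make the idea of transferring knowledge from a frozen semantic encoder concrete, the sketch below shows one common form of feature-alignment loss: the mean cosine distance between features of the trainable depth encoder and features of the frozen semantic encoder. This summary does not state the exact form of the semantic information loss, so the cosine formulation, the function names (`cosine_distance`, `semantic_alignment_loss`), and the list-of-vectors feature layout are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_distance(a, b, eps=1e-8):
    """Return 1 - cos(a, b) for two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / (norm_a * norm_b + eps)


def semantic_alignment_loss(depth_feats, semantic_feats):
    """Mean cosine distance over spatial locations (hypothetical loss form).

    depth_feats:    per-location features from the trainable depth encoder.
    semantic_feats: per-location features from the frozen semantic encoder;
                    they are treated as constants, so only the depth encoder
                    would receive gradients in a real training loop.
    """
    if len(depth_feats) != len(semantic_feats):
        raise ValueError("both feature maps must cover the same locations")
    distances = [cosine_distance(d, s) for d, s in zip(depth_feats, semantic_feats)]
    return sum(distances) / len(distances)


# Two locations: the first is aligned with its target, the second is orthogonal.
depth = [[1.0, 0.0], [1.0, 0.0]]
semantic = [[2.0, 0.0], [0.0, 3.0]]
loss = semantic_alignment_loss(depth, semantic)
```

Cosine distance ignores feature magnitude, so a loss of this kind would push the depth encoder to match the direction of the semantic features rather than their scale. In an actual training pipeline the same computation would be written with a tensor library and the semantic branch would be excluded from gradient updates.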
Abstract
Depth estimation is a key technology for 3D perception in unmanned systems. Monocular depth estimation has been widely studied because of its low cost, but existing methods suffer from poor depth accuracy and blurred object boundaries when run on embedded systems. In this paper, we propose BoRe-Depth, a novel monocular depth estimation model with only 8.7M parameters. It accurately estimates depth maps on embedded systems and significantly improves boundary quality. First, we design an Enhanced Feature Adaptive Fusion Module (EFAF) that adaptively fuses depth features to enhance the representation of boundary detail. Second, we integrate semantic knowledge into the encoder to improve its object recognition and boundary perception capabilities. Finally, we deploy BoRe-Depth on NVIDIA Jetson Orin, where it runs efficiently at 50.7 FPS. We demonstrate that the proposed model significantly outperforms previous lightweight models on multiple challenging datasets, and we provide detailed ablation studies of the proposed methods. The code is available at https://github.com/liangxiansheng093/BoRe-Depth.
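The two deployment figures quoted above (50.7 FPS and 8.7M parameters) translate into a per-frame time budget and a weight-storage footprint. The short calculation below works these out; the numeric precisions (4 bytes for FP32, 2 bytes for FP16) are assumptions, since the abstract does not say which precision is used on the Jetson Orin, and the memory figure covers weights only, not activations or runtime buffers.

```python
def frame_time_ms(fps):
    """Average time available per frame, in milliseconds."""
    return 1000.0 / fps


def weight_memory_mb(num_params, bytes_per_param):
    """Storage needed for the weights alone, in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


FPS = 50.7          # reported throughput on NVIDIA Jetson Orin
NUM_PARAMS = 8.7e6  # reported parameter count

latency_ms = frame_time_ms(FPS)
fp32_mb = weight_memory_mb(NUM_PARAMS, 4)  # assumed FP32 storage
fp16_mb = weight_memory_mb(NUM_PARAMS, 2)  # assumed FP16 storage

print(f"{latency_ms:.2f} ms per frame")   # 19.72 ms per frame
print(f"{fp32_mb:.1f} MB of FP32 weights")  # 34.8 MB of FP32 weights
print(f"{fp16_mb:.1f} MB of FP16 weights")  # 17.4 MB of FP16 weights
```

At roughly 19.7 ms per frame, the model fits within the 33.3 ms budget of a 30 FPS camera stream with headroom left for other processing on the same device.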
