Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe
TL;DR
This work introduces BiRefNet, a unified framework for high-resolution dichotomous image segmentation that decomposes the task into localization and reconstruction modules guided by a bilateral reference. The bilateral reference combines an inward source-image patch strategy with outward gradient supervision to preserve fine details and enhance boundary precision, complemented by targeted training strategies for DIS. Across DIS, HRSOD, COD, and SOD benchmarks, BiRefNet achieves state-of-the-art results and demonstrates strong generalization and efficiency. The approach offers practical insights for HR segmentation and provides avenues for rapid deployment and broader application in real-world scenarios.
Abstract
We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef). The LM aids in object localization using global semantic information. Within the RM, we utilize BiRef for the reconstruction process, where hierarchical patches of images provide the source reference and gradient maps serve as the target reference. These components collaborate to generate the final predicted maps. We also introduce auxiliary gradient supervision to enhance focus on regions with finer details. Furthermore, we outline practical training strategies tailored for DIS to improve map quality and the training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks, showing that BiRefNet exhibits remarkable performance, outperforming task-specific cutting-edge methods across all benchmarks. Our code is available at https://github.com/ZhengPeng7/BiRefNet.
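To make the two references in BiRef concrete, the following is a minimal sketch of the data they rely on: cutting an image into a grid of patches (the source reference) and deriving a gradient map (the target reference). It is an illustration under stated assumptions, not the released implementation: the function names `split_into_patches` and `gradient_magnitude`, the fixed square grid, and the central-difference gradient are all hypothetical choices, and images are plain nested lists so the example has no dependencies.

```python
import math


def split_into_patches(image, grid):
    """Split a 2-D image (list of rows) into grid x grid equal patches.

    Hypothetical helper: patches are returned in row-major order.
    """
    h, w = len(image), len(image[0])
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    ph, pw = h // grid, w // grid
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            rows = image[gy * ph:(gy + 1) * ph]
            patches.append([row[gx * pw:(gx + 1) * pw] for row in rows])
    return patches


def gradient_magnitude(image):
    """Gradient magnitude from central differences.

    Neighbor indices are clamped at the borders. This is an assumed,
    simple gradient operator chosen only for illustration.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            dy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            out[y][x] = math.hypot(dx, dy) / 2.0
    return out


# Toy 4x4 input: left half background (0), right half foreground (1).
toy = [[0, 0, 1, 1] for _ in range(4)]
source_patches = split_into_patches(toy, grid=2)  # four 2x2 patches
target_gradient = gradient_magnitude(toy)         # nonzero only near the edge
```

On the toy input, the gradient map is nonzero only in the two columns next to the foreground boundary, which is the kind of region that gradient supervision is meant to emphasize. A hierarchical version would repeat the patch split at several grid sizes, one per decoder stage; that pairing is likewise an assumption here.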
