Geometry-Aware Instance Segmentation with Disparity Maps
Cho-Ying Wu, Xiaoyan Hu, Michael Happold, Qiangeng Xu, Ulrich Neumann
TL;DR
This work addresses outdoor instance segmentation by integrating stereo disparity with RGB information through GAIS-Net, enabling geometry-aware mask regression across 2D, 2.5D, and 3D ROI representations. By back-projecting disparities into 3D and employing a PointNet-based 3D mask pipeline alongside image-based 2.5D masks, the method leverages geometric priors to improve segmentation under occlusion and suppress false positives. The framework includes a mask continuity loss, a self-supervised representation correspondence loss, and a MaskIoU-driven fusion scheme that combines the mask predictions from the different representations. The authors introduce the HQDS dataset with a longer baseline and higher resolution, achieving state-of-the-art results on HQDS and gains on Cityscapes, highlighting the practical value of stereo geometry for autonomous driving applications.
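The back-projection step mentioned above can be illustrated with the standard pinhole stereo relations Z = f·B/d, X = (u − cx)·Z/f, Y = (v − cy)·Z/f. The following is a minimal sketch under those textbook assumptions, not the authors' implementation; the function name `disparity_to_points` and its parameters (`focal_px`, `baseline_m`, `cx`, `cy`, `min_disp`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def disparity_to_points(disparity, focal_px, baseline_m, cx, cy, min_disp=1e-6):
    """Back-project a disparity map (in pixels) to an (N, 3) point cloud.

    Assumes a rectified pinhole stereo pair; all names here are illustrative.
    Pixels with disparity <= min_disp are treated as invalid and dropped.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    h, w = disparity.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disparity > min_disp
    # Depth from disparity: Z = f * B / d.
    z = focal_px * baseline_m / disparity[valid]
    # Lateral and vertical coordinates in the camera frame.
    x = (u[valid] - cx) * z / focal_px
    y = (v[valid] - cy) * z / focal_px
    return np.stack([x, y, z], axis=1)

# Toy example: a 2x2 disparity map with one invalid (zero) pixel.
toy_disparity = np.array([[10.0, 10.0], [0.0, 10.0]])
points = disparity_to_points(toy_disparity, focal_px=1000.0, baseline_m=0.5, cx=0.5, cy=0.5)
```

A point cloud of this form, cropped to a region of interest, is the kind of input a PointNet-style network consumes; how GAIS-Net crops and samples the points is not specified in this summary.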
Abstract
Most previous work on outdoor instance segmentation uses only color information from images. We explore a novel direction of sensor fusion that exploits stereo cameras. Geometric information from disparities helps separate overlapping objects of the same or different classes. Moreover, geometric information penalizes region proposals with unlikely 3D shapes, thus suppressing false positive detections. Mask regression is based on 2D, 2.5D, and 3D ROIs using pseudo-lidar and image-based representations. These mask predictions are fused by a mask scoring process. However, public datasets only adopt stereo systems with shorter baselines and focal lengths, which limits the measuring range of the stereo cameras. We collect and use the High-Quality Driving Stereo (HQDS) dataset, captured with a much longer baseline and focal length and at higher resolution. Our performance attains the state of the art. Please refer to our project page.
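The link between baseline, focal length, and measuring range follows from the same stereo relation Z = f·B/d: to first order, a disparity error δd produces a depth error δZ ≈ Z²/(f·B)·δd, so a larger f·B keeps depth usable at longer range. The sketch below only illustrates that relation; the function name `depth_error` and the numeric values are hypothetical and are not the HQDS or Cityscapes camera specifications.

```python
def depth_error(depth_m, focal_px, baseline_m, disp_err_px=1.0):
    """First-order depth uncertainty for a rectified stereo pair.

    Derived from Z = f * B / d, which gives |dZ| ~= Z**2 / (f * B) * |dd|.
    All parameter names and defaults are illustrative assumptions.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disp_err_px

# Hypothetical rigs observing an object 50 m away with 1 px disparity error.
short_rig = depth_error(50.0, focal_px=1000.0, baseline_m=0.5)
long_rig = depth_error(50.0, focal_px=2000.0, baseline_m=1.0)
```

With these made-up numbers, doubling both focal length and baseline cuts the depth uncertainty at 50 m by a factor of four, which is the sense in which a longer baseline and focal length extend the measuring range.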
