Occlusion-Ordered Semantic Instance Segmentation
Soroosh Baselizadeh, Cheuk-To Yu, Olga Veksler, Yuri Boykov
TL;DR
This work addresses 3D reasoning from a single image by introducing Occlusion-Ordered Semantic Instance Segmentation (OOSIS), a joint task that couples occlusion-based relative depth ordering with semantic instance segmentation. The approach has two stages: a CNN predicts semantic labels, oriented occlusion boundaries, and boundary orientations, and a CRF-based labeling then infers instance masks and their occlusion order simultaneously, enabling coherent 3D-like reasoning from 2D input. The authors introduce a novel oriented occlusion boundary model, a jump-move-based CRF optimization with a submodular upper bound, and a new OOSIS evaluation metric, the OAIR curve, which jointly measures mask accuracy and occlusion order correctness. Empirical results on KINS and COCOA show state-of-the-art performance on the joint task and demonstrate advantages over baselines that address only one of instance segmentation or occlusion ordering, highlighting the practical potential for improved 3D understanding of single-image scenes.
Abstract
Standard semantic instance segmentation provides useful but inherently 2D information from a single image. To enable 3D analysis, one usually integrates absolute monocular depth estimation with instance segmentation. However, monocular depth estimation is a difficult task. Instead, we leverage a simpler single-image task, occlusion-based relative depth ordering, which provides coarser but still useful 3D information. We show that relative depth ordering works more reliably from occlusions than from absolute depth. We propose to solve the joint task of relative depth ordering and segmentation of instances based on occlusions. We call this task Occlusion-Ordered Semantic Instance Segmentation (OOSIS). We develop an approach to OOSIS that extracts instances and their occlusion order simultaneously from oriented occlusion boundaries and semantic segmentation. Unlike the popular detect-and-segment framework for instance segmentation, combining occlusion ordering with instance segmentation allows a simple and clean formulation of OOSIS as a labeling problem. As part of our solution for OOSIS, we develop a novel approach to oriented occlusion boundaries that significantly outperforms prior work. We also develop a new joint OOSIS metric based on both instance mask accuracy and the correctness of the occlusion order. We achieve better performance than strong baselines on the KINS and COCOA datasets.
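
To make the notion of occlusion-based relative depth ordering concrete, the following minimal Python sketch assigns each instance an order label from pairwise "front occludes back" relations. It is an illustration only and not the method of this paper, which infers instance masks and their order jointly through CRF labeling. The function name `occlusion_layers`, the `(front, back)` pair format, and the example instance names are assumptions made for this sketch.

```python
from collections import defaultdict, deque


def occlusion_layers(instances, occludes):
    """Assign each instance a relative depth layer from pairwise occlusions.

    instances: iterable of hashable instance ids (hypothetical format).
    occludes:  iterable of (front, back) pairs, meaning front occludes back.
    Returns a dict mapping instance id to layer, where layer 0 holds the
    instances that nothing occludes and larger layers lie farther behind.
    Raises ValueError if the relations are cyclic, since no consistent
    ordering exists in that case.
    """
    behind = defaultdict(list)                # front -> instances it occludes
    n_occluders = {i: 0 for i in instances}   # remaining unprocessed occluders
    for front, back in occludes:
        behind[front].append(back)
        n_occluders[back] += 1

    layer = {i: 0 for i in n_occluders}
    queue = deque(i for i, n in n_occluders.items() if n == 0)
    processed = 0
    while queue:
        front = queue.popleft()
        processed += 1
        for back in behind[front]:
            # An instance sits one layer behind its deepest occluder.
            layer[back] = max(layer[back], layer[front] + 1)
            n_occluders[back] -= 1
            if n_occluders[back] == 0:
                queue.append(back)

    if processed != len(n_occluders):
        raise ValueError("occlusion relations contain a cycle")
    return layer


# Hypothetical scene: a car in front of a person, both in front of a bench.
order = occlusion_layers(
    ["car", "person", "bench"],
    [("car", "person"), ("person", "bench"), ("car", "bench")],
)
print(order)
```

The resulting labels are coarser than absolute depth: they say which instance is in front of which, not how far apart the instances are.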
