Instance-aware Semantic Segmentation via Multi-task Network Cascades
Jifeng Dai, Kaiming He, Jian Sun
TL;DR
The paper tackles instance-aware semantic segmentation by decomposing the task into three sub-tasks (differentiating instances, estimating masks, and categorizing objects) and solving them with a cascaded, multi-task CNN that shares convolutional features across stages. A differentiable RoI warping layer enables end-to-end training of the cascade, allowing gradients to flow through the predicted box coordinates and masks. Empirical results demonstrate state-of-the-art performance on PASCAL VOC and strong COCO segmentation results, with substantial speed advantages (around 360 ms per image with VGG-16) due to shared computation and the avoidance of external mask proposals. The approach extends to deeper cascades (e.g., 5-stage) and deeper backbones (e.g., ResNet-101), illustrating the method’s scalability and its practical value for fast, high-quality instance segmentation.
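To illustrate why a warping layer can pass gradients to box coordinates, here is a minimal sketch that assumes the warp is a bilinear resampling of the boxed region onto a fixed-size grid. The function names (bilinear, roi_warp), the box format (x0, y0, x1, y1), and the bin-centre sampling rule are assumptions made for this example, not the paper's exact formulation. Because every output value is a smooth function of the box coordinates, a small shift of the box produces a proportional change in the output.

```python
import math


def bilinear(fmap, x, y):
    """Sample a 2D list at real-valued (x, y), clamping to the border."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * fmap[y0][x0] + ax * fmap[y0][x1]
    bot = (1 - ax) * fmap[y1][x0] + ax * fmap[y1][x1]
    return (1 - ay) * top + ay * bot


def roi_warp(fmap, box, out_h, out_w):
    """Warp the region box = (x0, y0, x1, y1) onto a fixed out_h x out_w grid.

    Assumption for this sketch: one sample per output bin, at the bin centre.
    """
    x0, y0, x1, y1 = box
    out = []
    for i in range(out_h):
        y = y0 + (i + 0.5) * (y1 - y0) / out_h
        row = []
        for j in range(out_w):
            x = x0 + (j + 0.5) * (x1 - x0) / out_w
            row.append(bilinear(fmap, x, y))
        out.append(row)
    return out


# Toy feature map whose value equals the x coordinate (a horizontal ramp).
ramp = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]
warped = roi_warp(ramp, (0.0, 0.0, 2.0, 2.0), 2, 2)
shifted = roi_warp(ramp, (0.5, 0.0, 2.5, 2.0), 2, 2)
```

On this ramp, moving the box 0.5 to the right raises every warped value by 0.5, which is the kind of dependence on box coordinates that a hard crop followed by max pooling would not expose.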
Abstract
Semantic segmentation research has recently witnessed rapid progress, but many leading methods are unable to identify object instances. In this paper, we present Multi-task Network Cascades for instance-aware semantic segmentation. Our model consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects. These networks form a cascaded structure, and are designed to share their convolutional features. We develop an algorithm for the nontrivial end-to-end training of this causal, cascaded structure. Our solution is a clean, single-step training framework and can be generalized to cascades that have more stages. We demonstrate state-of-the-art instance-aware semantic segmentation accuracy on PASCAL VOC. Meanwhile, our method takes only 360 ms to test an image using VGG-16, which is two orders of magnitude faster than previous systems for this challenging problem. As a by-product, our method also achieves compelling object detection results which surpass the competitive Fast/Faster R-CNN systems. The method described in this paper is the foundation of our submissions to the MS COCO 2015 segmentation competition, where we won 1st place.
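The causal structure described in the abstract (each stage consumes the shared features plus the outputs of the earlier stages) can be sketched as a plain data flow. This is a toy illustration only: every function and class name below is hypothetical, and simple hand-written rules stand in for the three learned networks.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical format: (x0, y0, x1, y1)


@dataclass
class Instance:
    box: Box
    mask: List[List[int]]
    label: str


def shared_features(image):
    """Stand-in for the shared convolutional features (here: the image itself)."""
    return image


def stage1_boxes(feats) -> List[Box]:
    """Stage 1, differentiating instances. Toy rule: one box around nonzero pixels."""
    ys = [y for y, row in enumerate(feats) for v in row if v > 0]
    xs = [x for row in feats for x, v in enumerate(row) if v > 0]
    if not xs:
        return []
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def stage2_mask(feats, box: Box) -> List[List[int]]:
    """Stage 2, estimating masks. Depends on the stage-1 box."""
    x0, y0, x1, y1 = box
    return [[1 if feats[y][x] > 0 else 0 for x in range(x0, x1)]
            for y in range(y0, y1)]


def stage3_label(feats, box: Box, mask) -> str:
    """Stage 3, categorizing objects. Depends on both box and mask (toy rule)."""
    filled = sum(map(sum, mask))
    total = len(mask) * len(mask[0])
    return "solid" if filled == total else "sparse"


def cascade(image) -> List[Instance]:
    feats = shared_features(image)  # computed once, reused by every stage
    results = []
    for box in stage1_boxes(feats):
        mask = stage2_mask(feats, box)
        results.append(Instance(box, mask, stage3_label(feats, box, mask)))
    return results


image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
instances = cascade(image)
```

The point of the sketch is the dependency chain: the features are computed once, the mask stage cannot run before the box stage, and the categorization stage needs both. That dependency on earlier predictions is what makes joint training of such a cascade nontrivial.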
