Recurrent Generic Contour-based Instance Segmentation with Progressive Learning
Hao Feng, Keyi Zhou, Wengang Zhou, Yufei Yin, Jiajun Deng, Qi Sun, Houqiang Li
TL;DR
This work targets contour-based instance segmentation by introducing PolySnake, a three-module framework that iteratively refines object contours. An Initial Contour Generation module provides a coarse contour, which is progressively deformed by a GRU-based update within Iterative Contour Deformation, guided by vertex features aggregated through circle-convolution. A Multi-scale Contour Refinement stage further sharpens boundaries using large-scale semantic features, aided by a shape-focused training loss that regularizes contour geometry. Across SBD, Cityscapes, COCO, and KINS, and on additional tasks such as scene text detection (CTW1500) and lane detection (CULane), PolySnake demonstrates strong accuracy and generalization, often surpassing state-of-the-art contour-based methods while maintaining efficiency suitable for real-time use. The approach combines iterative contour evolution, multi-scale feature fusion, and shape-aware supervision, a combination that benefits diverse structured-geometry perception problems in vision.
Abstract
Contour-based instance segmentation has been actively studied, thanks to its flexibility and elegance in processing visual objects within complex backgrounds. In this work, we propose a novel deep network architecture, i.e., PolySnake, for generic contour-based instance segmentation. Motivated by the classic Snake algorithm, the proposed PolySnake achieves superior and robust segmentation performance with an iterative and progressive contour refinement strategy. Technically, PolySnake introduces a recurrent update operator to estimate the object contour iteratively. It maintains a single estimate of the contour that is progressively deformed toward the object boundary. At each iteration, PolySnake builds a semantic-rich representation for the current contour and feeds it to the recurrent operator for further contour adjustment. Through the iterative refinements, the contour progressively converges to a stable state that tightly encloses the object instance. Beyond the scope of general instance segmentation, extensive experiments are conducted to validate the effectiveness and generalizability of our PolySnake in two additional task scenarios: scene text detection and lane detection. The results demonstrate that the proposed PolySnake outperforms existing advanced methods on multiple prevalent benchmarks across the three tasks. The code and pre-trained models are available at https://github.com/fh2019ustc/PolySnake
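The recurrent refinement described above can be illustrated with a small NumPy sketch. It is not the released PolySnake code: the function names, the nearest-neighbour feature sampling, the layer sizes, and the random untrained weights are all assumptions made for illustration. It only shows the data flow: a single contour estimate is kept, a representation of the current vertices is built (here with a convolution that wraps around the closed contour), and a GRU-style operator turns that representation into per-vertex offsets.

```python
import numpy as np

def circular_conv(x, w):
    """Convolve vertex features x (N, C_in) with kernel w (K, C_in, C_out) on a closed contour."""
    half = w.shape[0] // 2
    out = np.zeros((x.shape[0], w.shape[2]))
    for i in range(w.shape[0]):
        # np.roll wraps around, so the first and last vertices are neighbours.
        out += np.roll(x, half - i, axis=0) @ w[i]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_features(feat, contour):
    """Nearest-neighbour lookup of an (H, W, C) feature map at (x, y) vertices (a simplification)."""
    h, w, _ = feat.shape
    xs = np.clip(np.rint(contour[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(contour[:, 1]).astype(int), 0, h - 1)
    return feat[ys, xs]

def gru_step(h, x, p):
    """Standard GRU cell applied independently to every vertex."""
    hx = np.concatenate([h, x], axis=1)
    z = sigmoid(hx @ p["wz"])          # update gate
    r = sigmoid(hx @ p["wr"])          # reset gate
    h_new = np.tanh(np.concatenate([r * h, x], axis=1) @ p["wh"])
    return (1.0 - z) * h + z * h_new

def make_params(rng, feat_ch, hidden, kernel=5, scale=0.1):
    """Random, UNTRAINED weights (hypothetical sizes); a real model would learn these."""
    in_ch = feat_ch + 2  # sampled features plus (x, y) coordinates
    return {
        "w_agg": rng.normal(0, scale, (kernel, in_ch, hidden)),
        "wz": rng.normal(0, scale, (2 * hidden, hidden)),
        "wr": rng.normal(0, scale, (2 * hidden, hidden)),
        "wh": rng.normal(0, scale, (2 * hidden, hidden)),
        "w_out": rng.normal(0, scale, (hidden, 2)),
    }

def initial_circle(cx, cy, radius, n):
    """Coarse starting contour: n vertices on a circle (stand-in for a learned initial contour)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([cx + radius * np.cos(t), cy + radius * np.sin(t)], axis=1)

def refine_contour(feat, contour, p, num_iters=4):
    """Keep one contour estimate and deform it num_iters times."""
    h = np.zeros((contour.shape[0], p["w_out"].shape[0]))  # recurrent hidden state
    history = [contour.copy()]
    for _ in range(num_iters):
        # Build a representation of the current contour.
        v = np.concatenate([sample_features(feat, contour), contour], axis=1)
        x = np.maximum(circular_conv(v, p["w_agg"]), 0.0)
        # Recurrent operator predicts per-vertex offsets.
        h = gru_step(h, x, p)
        contour = contour + h @ p["w_out"]
        history.append(contour.copy())
    return contour, history

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(64, 64, 8))  # stand-in for backbone features
params = make_params(rng, feat_ch=8, hidden=16)
final, steps = refine_contour(feature_map, initial_circle(32.0, 32.0, 10.0, 32), params)
print(final.shape, len(steps))
```

With untrained weights the contour moves arbitrarily; the convergence toward the object boundary that the abstract describes would come from training the operator end to end.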
