One-Shot Learning for Semantic Segmentation
Amirreza Shaban, Shray Bansal, Zhen Liu, Irfan Essa, Byron Boots
TL;DR
The paper tackles semantic segmentation of unseen classes by learning a conditioning branch that generates FCN classifier parameters from a single labeled support example. A second branch extracts dense per-pixel features from a query image, and the generated parameters classify those features, which enables fast one-shot segmentation. The method extends to $k$-shot without retraining by OR-aggregating the $k$ one-shot predictions. On the PASCAL-$5^i$ benchmark, the method reports a 25% relative meanIoU improvement over the best baseline in the one-shot setting and is at least 3 times faster, with pretraining further improving generalization. The work also introduces a dedicated benchmark for $k$-shot segmentation and shows that meta-learning is practical for dense prediction tasks.
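The mechanism summarized above can be illustrated with a minimal NumPy sketch. It assumes the generated parameters take the form of a weight vector `w` and bias `b` applied as a per-pixel logistic classifier, and that $k$-shot prediction is the logical OR of $k$ independently produced binary masks. The function names, the 0.5 threshold, and the hand-written parameters are illustrative assumptions, not the authors' implementation; in the paper the parameters come from a learned conditioning branch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_mask(features, w, b, threshold=0.5):
    """Classify every pixel of a query feature map with one (w, b) pair.

    features: (H, W, C) dense query features (assumed already extracted).
    w: (C,) weight vector, b: scalar bias (assumed output of the
       conditioning branch for one support example).
    Returns a boolean (H, W) foreground mask.
    """
    logits = features @ w + b          # (H, W): one logit per pixel
    return sigmoid(logits) > threshold

def k_shot_mask(features, params):
    """OR-aggregate the masks from k independently generated (w, b) pairs."""
    masks = [predict_mask(features, w, b) for w, b in params]
    return np.logical_or.reduce(masks)

# Toy example: 2x2 feature map with 2 channels and two hand-made "shots".
features = np.zeros((2, 2, 2))
features[0, 0] = [1.0, 0.0]
features[1, 1] = [0.0, 1.0]
params = [(np.array([10.0, 0.0]), -5.0), (np.array([0.0, 10.0]), -5.0)]
mask = k_shot_mask(features, params)
```

Because each support example is processed independently, adding more shots in this sketch only requires more forward passes, not retraining.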
Abstract
Low-shot learning methods for image classification support learning from sparse data. We extend these techniques to support dense semantic image segmentation. Specifically, we train a network that, given a small set of annotated images, produces parameters for a Fully Convolutional Network (FCN). We use this FCN to perform dense pixel-level prediction on a test image for the new semantic class. Our architecture shows a 25% relative meanIoU improvement compared to the best baseline methods for one-shot segmentation on unseen classes in the PASCAL VOC 2012 dataset and is at least 3 times faster.
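To make the reported metric concrete, the following sketch computes intersection-over-union for a binary mask and the relative improvement between two scores. The helper names and the empty-union convention are assumptions, the numbers are toy values rather than results from the paper, and the paper's exact averaging protocol for meanIoU is not reproduced here.

```python
import numpy as np

def binary_iou(pred, target):
    """Intersection-over-union between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return np.logical_and(pred, target).sum() / union

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.25 means 25%."""
    return (new - baseline) / baseline

# Toy masks: prediction covers 2 pixels, ground truth covers 1 of them.
iou = binary_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])
gain = relative_improvement(0.5, 0.4)  # toy scores, not from the paper
```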
