Just a Hint: Point-Supervised Camouflaged Object Detection
Huafeng Chen, Dian Shao, Guangqian Guo, Shan Gao
TL;DR
This work tackles camouflaged object detection (COD) using only point-level supervision. It introduces a point-to-region supervision strategy, an attention-regulating mask, and unsupervised contrastive learning to stabilize representations. The approach is built on a Pyramid Transformer backbone and trained on the newly created P-COD dataset, so that training requires only a single point per object. Empirical results show substantial gains over existing weakly supervised COD methods and performance competitive with fully supervised models across COD benchmarks, along with transferability to scribble supervision and salient object detection. The three contributions (Hint Area Generator, Attention Regulator, and Representation Optimizer) offer a practical and scalable path to high-quality COD with minimal annotation effort.
Abstract
Camouflaged Object Detection (COD) requires models to quickly and accurately distinguish objects that blend seamlessly into their environment. Owing to the subtle differences and ambiguous boundaries, COD is remarkably challenging not only for models but also for human annotators, who must expend considerable effort to provide pixel-wise annotations. To alleviate this heavy annotation burden, we propose to accomplish the task with only single-point supervision. Specifically, after an annotator quickly clicks on each object, we first adaptively expand the original point-based annotation into a reasonable hint area. Then, to avoid partial localization around discriminative parts, we propose an attention regulator that spreads model attention over the whole object by partially masking the labeled regions. Moreover, to address the unstable feature representation of camouflaged objects under point-only annotation, we perform unsupervised contrastive learning on differently augmented image pairs (e.g., color changes or translation). On three mainstream COD benchmarks, experimental results show that our model outperforms several weakly supervised methods by a large margin across various metrics.
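The first two steps described above (expanding a clicked point into a hint area, then partially masking the labeled region) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `expand_point_to_hint` and `mask_hint_region`, the fixed disk radius, and the random per-pixel dropping are all assumptions made for illustration, whereas the abstract only states that the expansion is adaptive and that labeled regions are partially masked.

```python
import random


def expand_point_to_hint(height, width, point, radius):
    """Return a binary mask (list of rows) marking a disk around the clicked point.

    Assumption: a fixed-radius disk stands in for the adaptive hint area.
    """
    cy, cx = point
    return [
        [1 if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2 else 0
         for x in range(width)]
        for y in range(height)
    ]


def mask_hint_region(image, hint, drop_ratio, seed=0):
    """Zero out a random fraction of the labeled (hint) pixels of a grayscale image.

    Assumption: random pixel dropping stands in for the attention-regulating mask.
    Returns the masked image and the set of dropped (row, col) coordinates.
    """
    rng = random.Random(seed)
    labeled = [(y, x) for y, row in enumerate(hint)
               for x, value in enumerate(row) if value]
    n_drop = int(len(labeled) * drop_ratio)
    dropped = set(rng.sample(labeled, n_drop))
    masked = [
        [0.0 if (y, x) in dropped else pixel for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]
    return masked, dropped


# Toy usage: a 7x7 image of ones with a single click at its center.
image = [[1.0] * 7 for _ in range(7)]
hint = expand_point_to_hint(7, 7, point=(3, 3), radius=2)
masked_image, dropped_pixels = mask_hint_region(image, hint, drop_ratio=0.5)
```

In this toy setup the hint mask would serve as the supervised region, while the masked image would be fed to the model so that it cannot rely solely on the pixels around the click.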
