Referring Camouflaged Object Detection

Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng

TL;DR

The paper introduces Ref-COD, a task that segments specified camouflaged objects given a small set of referring images in which the target objects are salient. It proposes a dual-branch framework, R2CNet, that learns common representations of the target from the references and uses a Referring Mask Generation module and a Referring Feature Enrichment module to guide segmentation, outperforming standard COD baselines. To support the new benchmark, the authors build R2C7K, a 7K-image dataset with Camo- and Ref- subsets covering 64 object categories, and report experiments and ablations on it. The results indicate that reference guidance improves the segmentation of specified camouflaged objects and the identification of their main body, which is relevant to real-world detection and search tasks. The work also suggests future directions, including multi-modal references and extensions to related vision tasks.

Abstract

We consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects. We first assemble a large-scale dataset, called R2C7K, which consists of 7K images covering 64 object categories in real-world scenarios. Then, we develop a simple but strong dual-branch framework, dubbed R2CNet, with a reference branch embedding the common representations of target objects from referring images and a segmentation branch identifying and segmenting camouflaged objects under the guidance of the common representations. In particular, we design a Referring Mask Generation module to generate a pixel-level prior mask and a Referring Feature Enrichment module to enhance the capability of identifying specified camouflaged objects. Extensive experiments show the superiority of our Ref-COD methods over their COD counterparts in segmenting specified camouflaged objects and identifying the main body of target objects. Our code and dataset are publicly available at https://github.com/zhangxuying1004/RefCOD.
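
The dual-branch idea in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the R2CNet implementation: the function names (`mean_embedding`, `prior_mask`, `enrich`) are hypothetical, averaging is assumed as the way to form a common representation, cosine similarity stands in for the Referring Mask Generation module, and appending the prior as an extra channel stands in for the Referring Feature Enrichment module. The actual module designs are not described in this summary.

```python
import math


def mean_embedding(ref_embeddings):
    """Average several reference embeddings into one common vector (assumed pooling)."""
    count = len(ref_embeddings)
    dim = len(ref_embeddings[0])
    return [sum(vec[d] for vec in ref_embeddings) / count for d in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def prior_mask(feature_map, common):
    """Score every pixel feature against the common vector, clamped to [0, 1]."""
    return [[min(1.0, max(0.0, cosine(px, common))) for px in row] for row in feature_map]


def enrich(feature_map, mask):
    """Append each pixel's prior score to its feature vector as an extra channel."""
    return [
        [px + [score] for px, score in zip(feat_row, mask_row)]
        for feat_row, mask_row in zip(feature_map, mask)
    ]


# Toy data (hypothetical): two 2-D reference embeddings and a 2x2 feature map.
refs = [[1.0, 0.0], [0.8, 0.2]]
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[0.9, 0.1], [-1.0, 0.0]]]

common = mean_embedding(refs)          # reference branch output
mask = prior_mask(features, common)    # pixel-level prior
enriched = enrich(features, mask)      # features passed on to segmentation
```

In this toy setup, pixels whose features point in the same direction as the common vector receive a prior close to 1, while the pixel with the opposite feature is clamped to 0. A real model would learn both the embeddings and the way the prior is fused with the segmentation features.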

Paper Structure (20 sections, 7 equations, 12 figures, 8 tables)


Figures (12)

  • Figure 1: Visual comparison between standard COD and our Ref-COD. Given an image containing multiple camouflaged objects, the COD model tends to find all possible camouflaged objects that are blended into the background without discrimination, while the Ref-COD model attempts to identify only the specified camouflaged objects, conditioned on a set of referring images.
  • Figure 2: Examples from our R2C7K dataset. Note that the camouflaged objects in the Camo- subset are masked with their annotations in orange.
  • Figure 3: Comparison of attributes between the Camo- subset and the Ref- subset, i.e., Object Area, Object Ratio, Object Distance, and Global Contrast. Results for the former are shown in orange, and those for the latter in blue.
  • Figure 4: Taxonomic system and log number distribution of our R2C7K dataset. Results for the Camo- subset are shown in red, and those for the Ref- subset in blue.
  • Figure 5: Image resolution distributions in the Camo- subset and the Ref- subset.
  • ...and 7 more figures