Concealed Object Detection

Deng-Ping Fan, Ge-Peng Ji, Ming-Ming Cheng, Ling Shao

TL;DR

This work introduces Concealed Object Detection (COD), a task focused on identifying objects that blend into their backgrounds. It provides COD10K, the first large-scale, richly annotated COD dataset (10,000 images, 78 categories), and a simple yet strong baseline, SINet, that uses a two-phase search and identification process built from TEM, NCD, and GRA modules. The authors establish a COD benchmark across CHAMELEON, CAMO, and COD10K, show that SINet surpasses 12 baselines, and report extensive ablations. They also discuss downstream applications in medicine, manufacturing, agriculture, art, and daily life, and lay out ten future research directions. Overall, COD10K and SINet offer a solid foundation for advancing concealed object understanding and its cross-domain impact, with open resources including code, dataset, and online demo.

Abstract

We present the first systematic study on concealed object detection (COD), which aims to identify objects that are "perfectly" embedded in their background. The high intrinsic similarities between the concealed objects and their background make COD far more challenging than traditional object detection/segmentation. To better understand this task, we collect a large-scale dataset, called COD10K, which consists of 10,000 images covering concealed objects in diverse real-world scenarios from 78 object categories. Further, we provide rich annotations including object categories, object boundaries, challenging attributes, object-level labels, and instance-level annotations. Our COD10K is the largest COD dataset to date, with the richest annotations, which enables comprehensive concealed object understanding and can even be used to help progress several other vision tasks, such as detection, segmentation, and classification. Motivated by how animals hunt in the wild, we also design a simple but strong baseline for COD, termed the Search Identification Network (SINet). Without any bells and whistles, SINet outperforms 12 cutting-edge baselines on all datasets tested, making it a robust, general architecture that could serve as a catalyst for future research in COD. Finally, we provide some interesting findings and highlight several potential applications and future directions. To spark research in this new field, our code, dataset, and online demo are available on our project page: http://mmcheng.net/cod.

Paper Structure

This paper contains 38 sections, 3 equations, 27 figures, 6 tables.

Figures (27)

  • Figure 1: Examples of background matching camouflage (BMC). There are seven birds in the left image and six in the right. Answers in color are shown in the figure labeled fig:Answer.
  • Figure 2: Task relationship. Given an input image (a), we present the ground-truth for (b) panoptic segmentation [kirillov2019panoptic] (which detects generic objects [liu2019deep, medioni2009generic], including stuff and things), (c) instance-level salient object detection [li2017instance, Fan2021SOC], and (d) the proposed concealed object detection task, where the goal is to detect objects that have a similar pattern to the natural environment. In this example, the boundaries of the two butterflies are blended with the bananas, making them difficult to identify.
  • Figure 3: Annotation diversity in the proposed COD10K dataset. Instead of only providing coarse-grained object-level annotations like in previous works, we offer six different annotations for each image, which include attributes and categories ($1^{st}$ row), bounding boxes ($2^{nd}$ row), object annotation ($3^{rd}$ row), instance annotation ($4^{th}$ row), and edge annotation ($5^{th}$ row).
  • Figure 4: Examples of sub-classes. Please refer to supplementary materials for other sub-classes.
  • Figure 5: Object and instance distributions of each concealed category in COD10K. COD10K consists of 5,066 concealed images from 69 categories. Zoom in for best view.
  • ...and 22 more figures
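
The annotation types listed in the Figure 3 caption are related: given an instance-level annotation, the object-level mask, bounding boxes, and an edge map can in principle be derived from it. The sketch below illustrates that relationship on a toy grid. It is an illustration only and not the authors' annotation pipeline; the integer-grid encoding (0 for background, positive ids for instances), the 4-neighbour edge rule, and the function names `object_mask`, `bounding_boxes`, and `edge_map` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: 0 = background, k > 0 = pixel belongs to instance k.
Grid = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def object_mask(instances: Grid) -> Grid:
    """Object-level annotation: 1 wherever any instance is present."""
    return [[1 if v > 0 else 0 for v in row] for row in instances]


def bounding_boxes(instances: Grid) -> Dict[int, Box]:
    """One tight, inclusive bounding box per instance id."""
    boxes: Dict[int, Box] = {}
    for r, row in enumerate(instances):
        for c, v in enumerate(row):
            if v == 0:
                continue
            if v not in boxes:
                boxes[v] = (r, c, r, c)
            else:
                top, left, bottom, right = boxes[v]
                boxes[v] = (min(top, r), min(left, c),
                            max(bottom, r), max(right, c))
    return boxes


def edge_map(instances: Grid) -> Grid:
    """Edge annotation (assumed rule): a foreground pixel is an edge pixel if
    any 4-neighbour has a different id or falls outside the image."""
    h, w = len(instances), len(instances[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            v = instances[r][c]
            if v == 0:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < h and 0 <= nc < w)
                if outside or instances[nr][nc] != v:
                    edges[r][c] = 1
                    break
    return edges


# Toy 4x5 instance map with two touching concealed instances.
toy = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 2],
    [0, 1, 1, 1, 2],
    [0, 1, 1, 1, 0],
]

if __name__ == "__main__":
    print(object_mask(toy))
    print(bounding_boxes(toy))
    print(edge_map(toy))
```

Attributes and categories are image-level labels and cannot be derived from the masks, which is why they appear as a separate annotation row in the figure.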