Advances in Deep Concealed Scene Understanding

Deng-Ping Fan; Ge-Peng Ji; Peng Xu; Ming-Ming Cheng; Christos Sakaridis; Luc Van Gool

TL;DR

This work surveys deep learning-driven Concealed Scene Understanding (CSU), delineating image- and video-level tasks (COS, COL, CIR, CIS, COC, VCOD, VCOS) and their formulations. It contributes the largest COS benchmark, introduces CDS2K for concealed industrial defect segmentation, and discusses open problems and directions, including domain adaptation and data-efficient learning. The paper provides extensive quantitative and qualitative benchmarks across COD10K, NC4K, CAMO, and the CDS2K dataset, highlighting the rise of transformer-based methods (e.g., CamoFormer, HitNet) for improved detection of camouflaged objects and the ongoing need for cross-domain generalization. The findings underscore the potential of semantic-level reasoning and vision-language integration to bridge human and machine understanding in concealed scenes, with practical impact in safety, industrial inspection, and medical imaging.

Abstract

Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive objects exhibiting camouflage. The current boom in terms of techniques and applications warrants an up-to-date survey. This can help researchers to better understand the global CSU field, including both current achievements and remaining challenges. This paper makes four contributions: (1) For the first time, we present a comprehensive survey of deep learning techniques aimed at CSU, including a taxonomy, task-specific challenges, and ongoing developments. (2) To allow for an authoritative quantification of the state-of-the-art, we offer the largest and latest benchmark for concealed object segmentation (COS). (3) To evaluate the generalizability of deep CSU in practical scenarios, we collect the largest concealed defect segmentation dataset termed CDS2K with the hard cases from diversified industrial scenarios, on which we construct a comprehensive benchmark. (4) We discuss open problems and potential research directions for CSU. Our code and datasets are available at https://github.com/DengPingFan/CSU, which will be updated continuously to watch and summarize the advancements in this rapidly evolving field.

Paper Structure (32 sections, 5 equations, 6 figures, 8 tables)

Introduction
Background
Task Taxonomy and Formulation
Image-level CSU
Video-level CSU
Task Relationship
Related Topics
Deep CSU Models
Image-level CSU Models
Concealed Object Segmentation
Concealed Instance Ranking
Concealed Instance Segmentation
Concealed Object Counting
Video-level CSU Models
Video Concealed Object Detection
...and 17 more sections

Figures (6)

Figure 1: Sample gallery of concealed scenarios. (a-d) show natural animals selected from fan2020camouflaged. (e) depicts a concealed human in art from le2019anabranch. (f) features a synthesized "lion" by zhang2020deep.
Figure 2: Illustration of the representative CSU tasks. Five of these are image-level tasks: (a) concealed object segmentation (COS), (b) concealed object localization (COL), (c) concealed instance ranking (CIR), (d) concealed instance segmentation (CIS), and (e) concealed object counting (COC). The remaining two are video-level tasks: (f) video concealed object detection (VCOD) and (g) video concealed object segmentation (VCOS). Each task has its own corresponding annotation visualization, which is explained in detail in §\ref{sec:task_taxonomy_and_formulation}.
Figure 3: Network architectures for COS at a glance. We present four types of frameworks from left to right: (a) multi-stream framework, (b) bottom-up/top-down framework and its variant with deep supervision (optional), and (c) branched framework. See §\ref{sec:cos_review} for more details.
Figure 4: Qualitative results of ten COS approaches. See §\ref{sec:cos_qualitative_comparison} for descriptions of the visual attributes in each column.
Figure 5: Sample gallery of our CDS2K. It is collected from five sub-databases: (a-l) MVTecAD, (m-o) NEU, (p) CrackForest, (q) KolektorSDD, and (r) MagneticTile. The defective regions are highlighted with red rectangles. (Top-Right) Word cloud visualization of CDS2K. (Bottom) Number of positive/negative samples for each category in our CDS2K.
...and 1 more figures
