Knowledge Rectification for Camouflaged Object Detection: Unlocking Insights from Low-Quality Data

Juwei Guan, Xiaolin Fang, Donghyun Kim, Haotian Gong, Tongxin Zhu, Zhen Ling, Ming Yang

TL;DR

This work tackles camouflaged object detection (COD) on low-quality data, where degraded edges and textures impair existing methods. It introduces KRNet, a Leader-Follower framework built on conditional diffusion models, in which a frozen Leader trained on high-quality data provides gold-standard conditional and hybrid distributions that the Follower uses to rectify the knowledge it learns from low-quality data. Key innovations include conditional-distribution consistency (CDC) and hybrid-distribution consistency (HDC) for knowledge rectification, cross-consistency (CC) to stabilize learning across augmented views, and a time-dependent conditional encoder (TCE) to diversify representations. Extensive experiments on CAMO, COD10K, and NC4K show that KRNet surpasses state-of-the-art COD methods under various downsampling regimes, supporting the effectiveness of knowledge rectification for degraded data in dense prediction tasks.

Abstract

Low-quality data often suffer from insufficient image details, introducing an extra implicit aspect of camouflage that complicates camouflaged object detection (COD). Existing COD methods focus primarily on high-quality data and overlook the challenges posed by low-quality data, which leads to significant performance degradation. We therefore propose KRNet, the first framework explicitly designed for COD on low-quality data. KRNet adopts a Leader-Follower framework in which the Leader extracts two gold-standard distributions, conditional and hybrid, from high-quality data to drive the Follower in rectifying knowledge learned from low-quality data. The framework further benefits from a cross-consistency strategy that improves the rectification of these distributions and a time-dependent conditional encoder that enriches the distribution diversity. Extensive experiments on benchmark datasets demonstrate that KRNet outperforms state-of-the-art COD methods and super-resolution-assisted COD approaches, showing its effectiveness in tackling the challenges of low-quality data in COD.
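The Leader-Follower idea can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the nearest-neighbour `downsample` used to create the low-quality view, the one-parameter `encode` function standing in for a conditional encoder, and the mean-squared-error `consistency_loss`. It only shows the general pattern of a frozen Leader computed on high-quality input serving as the target for a trainable Follower that sees low-quality input.

```python
def downsample(image, factor):
    """Hypothetical degradation: keep every `factor`-th pixel and repeat it
    (nearest neighbour) so the low-quality view keeps the original size."""
    return [[image[(r // factor) * factor][(c // factor) * factor]
             for c in range(len(image[0]))]
            for r in range(len(image))]

def encode(image, weight):
    """Hypothetical stand-in for a conditional encoder: scaled row means."""
    return [weight * sum(row) / len(row) for row in image]

def consistency_loss(follower_feat, leader_feat):
    """Mean squared error between Follower and Leader features (assumed loss)."""
    return sum((f - l) ** 2 for f, l in zip(follower_feat, leader_feat)) / len(leader_feat)

def rectify(hq, lq, leader_w=1.0, follower_w=0.2, lr=0.1, steps=200):
    """Pull the Follower's features on `lq` toward the frozen Leader's on `hq`."""
    leader_feat = encode(hq, leader_w)   # frozen: computed once, never updated
    base = encode(lq, 1.0)               # Follower features per unit weight
    losses = []
    for _ in range(steps):
        follower_feat = [follower_w * b for b in base]
        losses.append(consistency_loss(follower_feat, leader_feat))
        # Analytic gradient of the MSE with respect to the Follower weight.
        grad = sum(2 * (f - l) * b
                   for f, l, b in zip(follower_feat, leader_feat, base)) / len(base)
        follower_w -= lr * grad
    return follower_w, losses

# Deterministic 8x8 toy "image" with values in [0, 1].
hq = [[((r * 3 + c * 5) % 7) / 6 for c in range(8)] for r in range(8)]
lq = downsample(hq, 2)
follower_w, losses = rectify(hq, lq)
```

In this toy setting the consistency loss shrinks as the Follower's single weight is updated, while the Leader's features never change. The actual KRNet operates on diffusion-model feature distributions rather than row means, so the sketch conveys only the training pattern.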

Paper Structure

This paper contains 13 sections, 13 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: (a) Comparison of different COD methods on low-quality data. HitNet and CamoDiffusion fail to detect the camouflaged object in the low-quality image (LQ). HQ refers to the high-quality image, and GT denotes the manually annotated mask. (b) A performance comparison of our KRNet with other SOTA methods on COD10K.
  • Figure 2: The framework of the proposed KRNet. KRNet consists of a Leader-Follower model based on a conditional diffusion model. The frozen Leader extracts stable conditional ($\bm{c}$) and hybrid ($\bm{d}$) distributions from high-quality data, while the Follower uses these distributions as gold standards to rectify knowledge learned from low-quality data. For inference, high-quality segmentation is achieved using only low-quality data and randomly sampled Gaussian noise.
  • Figure 3: The differences in the hybrid distributions between the Leader and Follower. (a)-(c) represent the differences in hybrid distributions across the first to third layers of the denoising decoder. The gradually increasing discrepancies in the hybrid distributions make it challenging for the Follower to achieve high-quality segmentation results. The introduction of HDC can further rectify these knowledge discrepancies.
  • Figure 4: Visual quality comparison with SOTA methods. The figure presents various common target types, including large objects, small objects, objects with intricate details, and humans. The magnified patches reveal that crucial target information is largely missing in low-quality images. More results are shown in the Appendix.
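
The inference setting described in the Figure 2 caption, where only low-quality data and sampled Gaussian noise are needed, can be pictured with the toy loop below. It is a minimal sketch under strong assumptions and not the paper's sampler: `toy_denoiser` is a hypothetical placeholder that thresholds the condition and ignores the noisy input and the step index, and the `1/t` update rule is chosen only so the loop is easy to follow.

```python
import random

def toy_denoiser(x_t, condition, t):
    """Hypothetical denoiser: a real model would use x_t and t as well.
    Here it just thresholds the low-quality condition into a binary mask."""
    return [1.0 if c > 0.5 else 0.0 for c in condition]

def sample_mask(condition, steps=10, seed=0):
    """Start from Gaussian noise and step toward the predicted clean mask."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in condition]   # pure Gaussian noise
    for t in range(steps, 0, -1):
        x0_pred = toy_denoiser(x, condition, t)
        # Move a fraction 1/t of the way toward the prediction, so the
        # sample coincides with the prediction once t reaches 1.
        x = [xi + (pi - xi) / t for xi, pi in zip(x, x0_pred)]
    return x

lq_condition = [0.1, 0.9, 0.7, 0.2]   # stand-in for low-quality image features
mask = sample_mask(lq_condition)
```

Note that no high-quality image appears anywhere in the loop, which mirrors the caption's point that the Leader is needed only during training.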