C3Net: Context-Contrast Network for Camouflaged Object Detection
Baber Jan, Aiman H. El-Maleh, Abdul Jabbar Siddiqui, Abdul Bais, Saeed Anwar
TL;DR
C3Net addresses camouflaged object detection (COD) with a dual-pathway decoder that optimizes edge refinement and contextual localization separately, then fuses them through an attentive mechanism. The Edge Refinement Pathway (ERP) uses gradient- and Laplacian-initialized Edge Enhancement Modules to recover precise boundaries, while the Contextual Localization Pathway (CLP) employs Semantic Enhancement Units and the Image-based Context Guidance mechanism to suppress intrinsic saliency without external models. An Attentive Fusion Module and a triple-loss objective coordinate the edge, context, and final predictions, achieving state-of-the-art performance on the COD10K, CAMO, and NC4K benchmarks and handling all six COD challenges the paper identifies. The approach shows that carefully designed specialized pathways and intrinsic saliency suppression can outperform foundation-model baselines, with practical impact for medical imaging, wildlife monitoring, and industrial inspection.
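To make "gradient- and Laplacian-initialized" concrete, the sketch below shows the classical Sobel and Laplacian kernels that such an initialization would plausibly start from, applied to a toy image with a vertical step edge. This is an illustration under assumptions, not the authors' implementation: the summary does not state which kernels C3Net uses, and the names `SOBEL_X`, `SOBEL_Y`, `LAPLACIAN`, and `correlate2d` are hypothetical. In a trained network these values would only seed learnable 3x3 convolution weights.

```python
# Classical edge kernels, ASSUMED here as initial weights for a 3x3 conv layer.
# The source does not specify the exact kernels used by C3Net.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal intensity gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical intensity gradient
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]    # second-derivative response


def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation a conv layer computes."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 image: dark on the left, bright on the right (a vertical step edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

grad_x = correlate2d(image, SOBEL_X)     # responds to the vertical edge
grad_y = correlate2d(image, SOBEL_Y)     # no horizontal edge in this image
laplace = correlate2d(image, LAPLACIAN)  # sign change across the edge
```

On this toy input the horizontal-gradient kernel responds only near the step, the vertical-gradient kernel stays at zero, and the Laplacian flips sign across the boundary, which is the kind of boundary cue an edge-oriented pathway can refine during training.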
Abstract
Camouflaged object detection identifies objects that blend seamlessly with their surroundings through similar colors, textures, and patterns. This task challenges both traditional segmentation methods and modern foundation models, which fail dramatically on camouflaged objects. We identify six fundamental challenges in COD: Intrinsic Similarity, Edge Disruption, Extreme Scale Variation, Environmental Complexities, Contextual Dependencies, and Salient-Camouflaged Object Disambiguation. These challenges frequently co-occur and compound the difficulty of detection, requiring comprehensive architectural solutions. We propose C3Net, which addresses all challenges through a specialized dual-pathway decoder architecture. The Edge Refinement Pathway employs gradient-initialized Edge Enhancement Modules to recover precise boundaries from early features. The Contextual Localization Pathway utilizes our novel Image-based Context Guidance mechanism to achieve intrinsic saliency suppression without external models. An Attentive Fusion Module synergistically combines the two pathways via spatial gating. C3Net achieves state-of-the-art performance with S-measures of 0.898 on COD10K, 0.904 on CAMO, and 0.913 on NC4K, while maintaining efficient processing. C3Net demonstrates that complex, multifaceted detection challenges require architectural innovation, with specialized components working synergistically to achieve comprehensive coverage beyond isolated improvements. Code, model weights, and results are available at https://github.com/Baber-Jan/C3Net.
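As a rough illustration of fusing two pathways "via spatial gating", the sketch below computes a per-pixel gate with a sigmoid and uses it to blend an edge map with a context map. It is a minimal sketch under assumptions: the abstract does not give the gating formula, so the convex blend `g * edge + (1 - g) * context`, the function name `spatial_gate_fuse`, and the scalar parameters `w_edge`, `w_context`, and `bias` are hypothetical stand-ins for what would be learned convolutional weights in the actual module.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_gate_fuse(edge_map, context_map, w_edge=1.0, w_context=1.0, bias=0.0):
    """Blend two same-sized 2D maps with a per-pixel gate.

    ASSUMPTION: the gate is a sigmoid of a weighted sum of both inputs, and
    the output is a convex combination of the two maps. The real Attentive
    Fusion Module may differ; its formula is not given in the abstract.
    """
    fused = []
    for edge_row, ctx_row in zip(edge_map, context_map):
        row = []
        for e, c in zip(edge_row, ctx_row):
            g = sigmoid(w_edge * e + w_context * c + bias)  # gate in (0, 1)
            row.append(g * e + (1.0 - g) * c)               # per-pixel blend
        fused.append(row)
    return fused


# Toy 2x2 maps standing in for the outputs of the two decoder pathways.
edge = [[1.0, 0.2], [0.0, 0.9]]
context = [[-1.0, 0.8], [0.0, 0.1]]
fused = spatial_gate_fuse(edge, context)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding edge and context values, so the gate decides, pixel by pixel, which pathway to trust more.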
