Towards Accurate Camouflaged Object Detection with Mixture Convolution and Interactive Fusion

Geng Chen; Xinrui Chen; Bo Dong; Mingchen Zhuge; Yongxiong Wang; Hongbo Bi; Jian Chen; Peng Wang; Yanning Zhang

TL;DR

This work targets camouflaged object detection (COD) by addressing two core challenges: the need for a large receptive field to capture rich context, and effective fusion to integrate multi-level features. The authors introduce MCIF-Net, which combines Dual-Branch Mixture Convolution (DMC) for context expansion with Multi-Level Interactive Fusion (MIF) for attentive feature fusion, achieving state-of-the-art results on COD benchmarks. Key contributions include the DMC module for receptive-field enlargement, the MIF module for interactive cross-level fusion, and extensive experiments plus ablations validating their effectiveness. The approach demonstrates strong generalization and transferability, including a successful extension to polyp segmentation, and offers a practical pathway toward robust COD in challenging natural scenes.

Abstract

Camouflaged object detection (COD), which aims to identify objects that conceal themselves in their surroundings, has recently attracted increasing research attention in computer vision. In practice, the success of deep learning based COD is mainly determined by two key factors: (i) a significantly large receptive field, which provides rich context information, and (ii) an effective fusion strategy, which aggregates the rich multi-level features for accurate COD. Motivated by these observations, we propose a novel deep learning based COD approach that integrates a large receptive field and effective feature fusion into a unified framework. Specifically, we first extract multi-level features from a backbone network. The resulting features are then fed to the proposed dual-branch mixture convolution modules, each of which uses multiple asymmetric convolutional layers and two dilated convolutional layers to extract rich context features from a large receptive field. Finally, we fuse the features using specially designed multi-level interactive fusion modules, each of which employs an attention mechanism along with feature interaction for effective feature fusion. Our method thus detects camouflaged objects by aggregating, through an effective fusion strategy, the rich context information gathered from a large receptive field. These designs match the requirements of COD well and enable accurate detection of camouflaged objects. Extensive experiments on widely used benchmark datasets demonstrate that our method detects camouflaged objects accurately and outperforms state-of-the-art methods.
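To make the receptive-field argument concrete, the following minimal sketch computes the receptive field of a stack of stride-1 convolutions using the standard rule that a kernel of size k with dilation d extends the field by d(k - 1). The layer configurations, including the kernel sizes and the dilation rate, are hypothetical values chosen for illustration; the abstract does not specify them, and this is not the authors' module.

```python
from typing import List, Tuple

# Each layer is (kernel_height, kernel_width, dilation).
# Assumption: every layer has stride 1, so growth is purely additive.
Layer = Tuple[int, int, int]


def receptive_field(layers: List[Layer]) -> Tuple[int, int]:
    """Return the (height, width) receptive field of stacked stride-1 convolutions."""
    rf_h, rf_w = 1, 1
    for k_h, k_w, dilation in layers:
        # A kernel of size k with dilation d spans d * (k - 1) + 1 inputs,
        # so it extends the receptive field by d * (k - 1) along that axis.
        rf_h += dilation * (k_h - 1)
        rf_w += dilation * (k_w - 1)
    return rf_h, rf_w


def weights_per_channel_pair(layers: List[Layer]) -> int:
    """Count kernel weights per input/output channel pair (dilation adds none)."""
    return sum(k_h * k_w for k_h, k_w, _ in layers)


# Hypothetical branch: asymmetric 1x5 and 5x1 layers, then a 3x3 layer with dilation 5.
mixture_branch = [(1, 5, 1), (5, 1, 1), (3, 3, 5)]
# Baseline for comparison: three ordinary 3x3 layers.
plain_branch = [(3, 3, 1), (3, 3, 1), (3, 3, 1)]

print(receptive_field(mixture_branch), weights_per_channel_pair(mixture_branch))
print(receptive_field(plain_branch), weights_per_channel_pair(plain_branch))
```

Under these assumed settings, the asymmetric-plus-dilated stack covers a 15 × 15 region with 19 weights per channel pair, while three plain 3 × 3 layers cover 7 × 7 with 27. This is the kind of trade-off that motivates combining asymmetric and dilated convolutions when a large receptive field is needed.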

Figures (7)