Foreground Focus: Enhancing Coherence and Fidelity in Camouflaged Image Generation
Pei-Chi Chen, Yi Yao, Chan-Feng Hsu, HongXia Xie, Hung-Jen Chen, Hong-Han Shuai, Wen-Huang Cheng
TL;DR
The paper tackles camouflaged image generation under data scarcity by introducing FACIG, a latent-diffusion framework that explicitly aligns foreground and background through a Foreground-Aware Feature Integration Module (FAFIM) and a Foreground-Aware Denoising Loss. FACIG retrieves background knowledge from a pre-trained codebook and fuses it with foreground cues through attention, while the loss upweights foreground reconstruction to preserve the details of small objects. Experiments on COD10K and CAMO show gains over prior foreground- and background-guided methods in overall camouflaged image quality (FID, KID) and foreground fidelity (PSNR, SSIM), especially for small objects. The generated scenes are more coherent and natural-looking, which suggests practical value for data-scarce camouflaged vision perception tasks.
Abstract
Camouflaged image generation is emerging as a solution to data scarcity in camouflaged vision perception, offering a cost-effective alternative to data collection and labeling. The current state-of-the-art approach generates camouflaged images from foreground objects alone, but it has two critical weaknesses: 1) the background knowledge does not integrate effectively with foreground features, resulting in a lack of foreground-background coherence (e.g., color discrepancy); 2) the generation process does not prioritize the fidelity of foreground objects, which leads to distortion, particularly for small objects. To address these issues, we propose a Foreground-Aware Camouflaged Image Generation (FACIG) model. Specifically, we introduce a Foreground-Aware Feature Integration Module (FAFIM) to strengthen the integration between foreground features and background knowledge. In addition, we design a Foreground-Aware Denoising Loss to strengthen supervision of foreground reconstruction. Experiments on multiple datasets show that our method outperforms previous methods in overall camouflaged image quality and foreground fidelity.
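The idea behind a foreground-aware denoising loss can be illustrated with a minimal sketch. The exact formulation is not given in the text above, so the code below assumes one plausible form: a standard noise-prediction squared error in which pixels inside the foreground mask receive a larger weight. The function name, the `fg_weight` parameter, and the tensor shapes are all hypothetical and chosen only for illustration.

```python
import numpy as np

def foreground_weighted_denoising_loss(noise_pred, noise_true, fg_mask, fg_weight=3.0):
    """Weighted mean squared error between predicted and true noise.

    Assumed (hypothetical) form, not the paper's confirmed definition:
    background positions get weight 1, foreground positions get fg_weight.

    noise_pred, noise_true: arrays of shape (B, C, H, W)
    fg_mask: array of shape (B, 1, H, W), 1 inside the foreground, 0 elsewhere
             (assumed to be already resized to the latent resolution)
    """
    sq_err = (noise_pred - noise_true) ** 2
    # The mask broadcasts over the channel axis.
    weights = 1.0 + (fg_weight - 1.0) * fg_mask
    return float((weights * sq_err).mean())

# Toy example: unit error everywhere, with a 2x2 foreground patch in an 8x8 map.
pred = np.ones((1, 4, 8, 8))
true = np.zeros((1, 4, 8, 8))
mask = np.zeros((1, 1, 8, 8))
mask[:, :, 3:5, 3:5] = 1.0

plain_loss = foreground_weighted_denoising_loss(pred, true, mask, fg_weight=1.0)
weighted_loss = foreground_weighted_denoising_loss(pred, true, mask, fg_weight=3.0)
```

With `fg_weight=1.0` the function reduces to an ordinary mean squared error. Raising the weight increases the contribution of foreground positions, so errors on a small object that covers only a few pixels are not drowned out by the much larger background area.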
