Foreground Focus: Enhancing Coherence and Fidelity in Camouflaged Image Generation

Pei-Chi Chen, Yi Yao, Chan-Feng Hsu, HongXia Xie, Hung-Jen Chen, Hong-Han Shuai, Wen-Huang Cheng

TL;DR

The paper tackles camouflaged image generation under data scarcity by introducing FACIG, a latent-diffusion framework that explicitly aligns foreground and background through a Foreground-Aware Feature Integration Module (FAFIM) and a Foreground-Aware Denoising Loss. FACIG retrieves background knowledge from a pre-trained codebook and fuses it with foreground cues using attention-based fusion, while the loss upweights foreground reconstruction to preserve small-object details. Experiments on COD10K and CAMO show gains in overall camouflaged image quality (FID, KID) and foreground fidelity (PSNR, SSIM), especially for small objects, over prior foreground- and background-guided methods. The approach yields more coherent, natural-looking camouflaged scenes with better-preserved foregrounds, which suggests practical value for data-efficient camouflaged vision perception tasks.

Abstract

Camouflaged image generation is emerging as a solution to data scarcity in camouflaged vision perception, offering a cost-effective alternative to data collection and labeling. Recently, the state-of-the-art approach successfully generates camouflaged images using only foreground objects. However, it faces two critical weaknesses: 1) the background knowledge does not integrate effectively with foreground features, resulting in a lack of foreground-background coherence (e.g., color discrepancy); 2) the generation process does not prioritize the fidelity of foreground objects, which leads to distortion, particularly for small objects. To address these issues, we propose a Foreground-Aware Camouflaged Image Generation (FACIG) model. Specifically, we introduce a Foreground-Aware Feature Integration Module (FAFIM) to strengthen the integration between foreground features and background knowledge. In addition, a Foreground-Aware Denoising Loss is designed to enhance foreground reconstruction supervision. Experiments on various datasets show our method outperforms previous methods in overall camouflaged image quality and foreground fidelity.
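The idea of a denoising loss that strengthens foreground supervision can be illustrated with a minimal sketch. This is not the paper's formulation, which is not reproduced in this summary. It assumes a standard noise-prediction squared error, a binary foreground mask at latent resolution, and a hypothetical scalar `foreground_weight`; the function name and all parameter names are illustrative.

```python
from typing import Sequence


def foreground_weighted_mse(
    noise: Sequence[float],
    predicted_noise: Sequence[float],
    foreground_mask: Sequence[float],
    foreground_weight: float = 2.0,
) -> float:
    """Mean squared denoising error with foreground positions upweighted.

    All three sequences are flattened latents of equal length. The mask is
    1.0 at foreground positions and 0.0 elsewhere. `foreground_weight` is a
    hypothetical hyperparameter, not a value taken from the paper.
    """
    if not (len(noise) == len(predicted_noise) == len(foreground_mask)):
        raise ValueError("inputs must have the same length")
    if not noise:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for eps, eps_hat, m in zip(noise, predicted_noise, foreground_mask):
        # Background positions get weight 1; foreground positions get
        # `foreground_weight` (assumed weighting scheme).
        weight = 1.0 + (foreground_weight - 1.0) * m
        total += weight * (eps - eps_hat) ** 2
    return total / len(noise)
```

With `foreground_weight=1.0` the function reduces to a plain mean squared error. Larger values make errors inside the mask cost more, which is one way a small object covering few latent positions can still contribute meaningfully to the training signal.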

Paper Structure

This paper contains 15 sections, 15 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Comparison between images generated by our method and by LAKE-RED. The top row shows the complete images, while the bottom row presents zoomed-in views of the highlighted areas. LAKE-RED shows a lack of coherence between foreground and background, while our method enhances foreground-background coherence and foreground fidelity.
  • Figure 2: Overall framework of FACIG, which comprises four stages: (1) compression of the input image into a latent space, (2) background knowledge retrieval from the codebook, (3) foreground-aware feature integration, where foreground features are integrated into the background knowledge, and (4) a denoising process guided by the integrated features.
  • Figure 3: Qualitative results of our method and SOTA image generation methods. The first two columns are the original images and the object masks.
  • Figure 4: User study evaluating human preferences for camouflaged images generated by our method and SOTA methods.