Object Fidelity Diffusion for Remote Sensing Image Generation

Ziqi Ye; Shuran Ma; Jie Yang; Xiaoyi Yang; Yi Yang; Ziyang Gong; Xue Yang; Haipeng Wang

TL;DR

This work tackles high-fidelity, controllable remote sensing image generation by introducing Object Fidelity Diffusion (OF-Diff), which extracts object shape priors from layouts and employs online distillation to align diffusion outputs without real-image references. It augments diffusion with a dual-decoder architecture and a Shape Generation Module to enforce morphology-consistent objects, and adds DDPO fine-tuning to improve diversity and semantic consistency. Empirical results show that OF-Diff outperforms state-of-the-art layout-to-image methods on fidelity, layout accuracy, and downstream detection metrics, with notable gains on small and polymorphic objects. The approach improves practical remote sensing data augmentation for object detection, highlights a trade-off between aesthetics and distribution fidelity, and identifies mask quality as a key dependency of the results.

Abstract

High-precision controllable remote sensing image generation is both meaningful and challenging. Existing diffusion models often produce low-fidelity images because they fail to capture morphological details adequately, which may affect the robustness and reliability of object detection models. To improve the accuracy and fidelity of generated objects in remote sensing, this paper proposes Object Fidelity Diffusion (OF-Diff). Specifically, we are the first to extract the prior shapes of objects from the layout for diffusion models in remote sensing. We then introduce a dual-branch diffusion model with a diffusion consistency loss, which can generate high-fidelity remote sensing images without requiring real images during the sampling phase. Furthermore, we introduce DDPO to fine-tune the diffusion process, making the generated remote sensing images more diverse and semantically consistent. Comprehensive experiments demonstrate that OF-Diff outperforms state-of-the-art remote sensing methods across key quality metrics. Notably, performance on several polymorphic and small object classes improves significantly: the mAP increases by 8.3%, 7.7%, and 4.0% for airplanes, ships, and vehicles, respectively.
