MOGAN: Morphologic-structure-aware Generative Learning from a Single Image

Jinshu Chen, Qihui Xu, Qi Kang, MengChu Zhou

TL;DR

This work introduces MOGAN, a morphologic-structure-aware generative adversarial network that is trained on a single image and produces random samples with diverse appearances and reliable structures. It focuses on internal features: preserving rational structures while varying appearance.

Abstract

In most interactive image generation tasks, users specify regions of interest (ROI), and the generated results are expected to show adequate diversity in appearance while preserving the correct and reasonable structures of the original image. Such tasks become more challenging when only limited data is available. Recently proposed generative models can be trained on a single image, but they attend mainly to the monolithic features of the sample and ignore the semantic information of the different objects inside it. As a result, for ROI-based generation tasks, they may produce inappropriate samples with excessive randomness that fail to maintain the correct structures of the related objects. To address this issue, this work introduces a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances and reliable structures based on only one image. To train on the ROI, we propose to use augmented versions of the original image and introduce a novel module that transforms this augmented data into knowledge of both structures and appearances, thus enhancing the model's comprehension of the sample. To learn the remaining areas outside the ROI, we employ binary masks to keep their generation isolated from the ROI. Finally, we arrange these learning processes as parallel, hierarchical branches. Compared with other single-image GAN schemes, our approach focuses on internal features, including the maintenance of rational structures and variation in appearance. Experiments confirm that our model performs better on ROI-based image generation tasks than competing methods.
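
The role of a binary mask in keeping ROI and background separate can be pictured as a simple compositing step. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fuse_with_mask`, the array shapes, and the constant-valued stand-in images are all hypothetical, and NumPy is used only for illustration.

```python
import numpy as np

def fuse_with_mask(roi_sample, background_sample, mask):
    """Composite two images with a binary mask (hypothetical helper).

    Pixels where mask == 1 come from roi_sample; pixels where mask == 0
    come from background_sample.
    """
    # Expand (H, W) -> (H, W, 1) so the mask broadcasts over colour channels.
    m = mask.astype(roi_sample.dtype)[..., None]
    return m * roi_sample + (1.0 - m) * background_sample

# Toy stand-ins for the outputs of two generator branches (assumed shapes).
h, w = 4, 6
roi_sample = np.full((h, w, 3), 0.9)         # pretend ROI-branch output
background_sample = np.full((h, w, 3), 0.1)  # pretend background-branch output

# Binary mask marking a rectangular region of interest.
mask = np.zeros((h, w), dtype=np.uint8)
mask[1:3, 2:5] = 1

fused = fuse_with_mask(roi_sample, background_sample, mask)
```

Because the mask is strictly binary, each output pixel is taken from exactly one of the two inputs, which is one simple way to keep the two regions from influencing each other at composition time.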

Paper Structure

This paper contains 12 sections, 8 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Random generation learned from a single image. We introduce an unconditional generative model that handles interactive ROI-based image generation tasks while being trained on only one image. It can generate varied, high-quality samples that have both rational structures and diverse appearances.
  • Figure 2: MOGAN contains two parallel hierarchical branches responsible for generating the ROI and the background. The ROI branch takes the ROI cropped from the original image as its training target, while the background branch takes the original image together with a binary mask marking the background regions. Finally, the outputs of the two branches are fused into complete, high-quality images.
  • Figure 3: Details of sub-GANs in two branches. (a) ROI branch. Generators are organised around residual blocks that mainly contain a convolution layer and a deformable convolution layer. A novel module named a style injector, which transforms the augmented original image into knowledge of structures and appearances, controls the style of generation through affine transforms. (b) Background branch. Generators are built from residual blocks that mainly contain two gated convolution layers. Discriminators of both branches are Markovian discriminators. Details of residual blocks in the ROI branch and the background branch are shown in (c) and (d) respectively.
  • Figure 4: Details of a style injector. It is essentially a lightweight encoder with two bypasses that produce a weight and a bias respectively. Taking the augmented original image as input, it steers the generator's dataflow through affine transforms.
  • Figure 5: Randomly generated samples. Our model can produce diverse images across different areas and topics for ROI-based image generation tasks. The generated results maintain the original structure of the objects while varying considerably in appearance.
  • ...and 4 more figures
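
The Figure 4 caption describes a style injector that emits a weight and a bias and applies them to the generator's features through an affine transform. A rough sketch of that idea follows, assuming a toy encoder (global average pooling followed by two linear heads) in place of the real one. The names `toy_style_injector`, `modulate`, `w_head`, and `b_head`, as well as all shapes, are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_style_injector(augmented_image, w_head, b_head):
    """Hypothetical stand-in for a lightweight encoder with two bypasses."""
    # Global average pooling turns the (H, W, C_in) image into a (C_in,) code.
    code = augmented_image.mean(axis=(0, 1))
    weight = code @ w_head  # first bypass: per-channel scale, shape (C_feat,)
    bias = code @ b_head    # second bypass: per-channel shift, shape (C_feat,)
    return weight, bias

def modulate(features, weight, bias):
    """Affine transform of generator features; broadcasts over H and W."""
    return features * weight + bias

# Assumed shapes: an 8x8 RGB augmented image and a 16-channel feature map.
augmented_image = rng.random((8, 8, 3))
features = rng.standard_normal((8, 8, 16))
w_head = rng.standard_normal((3, 16))  # untrained, random projection heads
b_head = rng.standard_normal((3, 16))

weight, bias = toy_style_injector(augmented_image, w_head, b_head)
styled = modulate(features, weight, bias)
```

In this sketch a different augmentation of the input image would yield a different `weight` and `bias`, and therefore a different modulation of the same feature map. How the actual module is built and trained is not specified in the captions above.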