CanvasPic: An Interactive Tool for Freely Generating Facial Images Based on Spatial Layout

Jiafu Wei, Chia-Ming Chang, Xi Yang, Takeo Igarashi

TL;DR

CanvasPic introduces a 2D spatial-layout interface for GAN-based facial image generation that lets users import real-world reference images and adjust each attribute's influence by changing distances in the layout. It combines a pre-trained e4e encoder with StyleGAN2 and applies local and global attribute transfer with distance-weighted influence, enabling intuitive, flexible control. In a user study with 24 participants, CanvasPic outperformed the baselines in flexibility, ease of use, intuitiveness, and generated results, and participants reported a strong willingness to use it. The work demonstrates the practical impact of human-centered, spatial-layout designs for controllable image generation and outlines directions for broader-domain applications and automated attribute extraction.

Abstract

In real-world usage, existing GAN image generation tools fall short because they lack intuitive interfaces and offer limited flexibility. To overcome these limitations, we developed CanvasPic, a tool for flexible GAN image generation. Our tool introduces a novel 2D layout design that allows users to intuitively control image attributes based on real-world images. By adjusting the distances between images in the spatial layout, users can control the influence of each attribute on the target image and explore a wide range of generated results. We conducted a user study with 24 participants, designed around practical application scenarios, to compare our tool with existing GAN image generation tools. The results demonstrate that our tool significantly enhances the user experience and helps users achieve their desired generative results more effectively.
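
The distance-based control described above can be illustrated with a minimal sketch. The summary does not give the exact weighting function CanvasPic uses, so everything here is an assumption made for illustration: the linear falloff, the `max_distance` cutoff, the function names, and the use of plain Python lists as stand-ins for latent codes (in the real system these would come from the e4e encoder and be decoded by StyleGAN2).

```python
import math


def influence_weight(distance, max_distance=300.0):
    """Map a layout distance to an influence weight in [0, 1].

    Assumed linear falloff (not taken from the paper): weight 1 when the
    reference sits on top of the target, weight 0 at max_distance or beyond.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return max(0.0, 1.0 - distance / max_distance)


def blend_latent(target_pos, target_latent, references, max_distance=300.0):
    """Pull the target latent toward each reference latent by its weight.

    `references` is a list of (position, latent) pairs; positions are (x, y)
    layout coordinates and latents are equal-length lists of floats.
    """
    weights = [
        influence_weight(math.dist(target_pos, pos), max_distance)
        for pos, _ in references
    ]
    # If the weights sum to more than 1, rescale them so the result stays
    # a convex combination of the target and the references.
    total = sum(weights)
    if total > 1.0:
        weights = [w / total for w in weights]

    blended = list(target_latent)
    for w, (_, latent) in zip(weights, references):
        for i, value in enumerate(latent):
            blended[i] += w * (value - target_latent[i])
    return blended


if __name__ == "__main__":
    target = [0.0, 0.0]
    refs = [((150.0, 0.0), [2.0, 4.0])]  # halfway to the cutoff distance
    print(blend_latent((0.0, 0.0), target, refs))
```

In this sketch, dragging a reference image closer to the target increases its weight, and moving it past `max_distance` removes its influence entirely. A real implementation would presumably restrict the blend to the latent dimensions associated with the selected local or global attribute.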

Paper Structure (18 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: CanvasPic workflow: (a) Importing reference images and the target image. (b) Placing reference images within the spatial layout workspace. (c) Selecting attributes from the reference images. (d) Controlling the generated results by adjusting the distance between the images. Users can then return to step (b) or (c) to continue the image generation process.
  • Figure 2: Our interface. The interface comprises four components: a) Reference image bar, b) Spatial layout workspace, c) Results panel, and d) History module.
  • Figure 3: Comparison of the results generated by different influence intensities of local attributes. CanvasPic uses image distance to adjust intensity, whereas StyleCLIP and HFGI use sliders and SketchEdit cannot adjust intensity.
  • Figure 4: Comparison of the results generated by different influence intensities of global attributes.