Structure-Aware Human Body Reshaping with Adaptive Affinity-Graph Network

Qiwen Deng, Yangcen Liu, Wen Li, Guoqing Wang

TL;DR

We propose a novel Adaptive Affinity-Graph Network (AAGN), which extracts the global affinity between different body parts to enhance the quality of the generated optical flow. It surpasses all previous work and achieves state-of-the-art results on all evaluation metrics.

Abstract

Given a source portrait, the automatic human body reshaping task aims to edit it into an aesthetically pleasing body shape. As the technology has been widely used in media, several methods have been proposed, mainly focusing on generating optical flow to warp the body shape. However, these previous works consider only the local transformation of different body parts (arms, torso, and legs) and ignore the global affinity, which limits their capacity to ensure consistency and quality across the entire body. In this paper, we propose a novel Adaptive Affinity-Graph Network (AAGN), which extracts the global affinity between different body parts to enhance the quality of the generated optical flow. Specifically, our AAGN introduces two main designs. (1) We propose an Adaptive Affinity-Graph (AAG) Block that leverages the characteristics of a fully connected graph. AAG represents different body parts as nodes in an adaptive fully connected graph and captures all the affinities between nodes to obtain a global affinity map. This design can improve the consistency between body parts. (2) Because high-frequency details are crucial for photo aesthetics, we design a Body Shape Discriminator (BSD) that extracts information from both the high-frequency and spatial domains. In particular, an SRM filter is used to extract high-frequency details, which are combined with spatial features as input to the BSD. With this design, the BSD guides the Flow Generator (FG) to attend to fine details rather than rigid pixel-level fitting. Extensive experiments on the BR-5K dataset demonstrate that our framework significantly enhances the aesthetic appeal of reshaped photos, surpassing all previous work and achieving state-of-the-art results on all evaluation metrics.
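
To make the idea of a global affinity map over body-part nodes concrete, the following is a minimal sketch, not the authors' AAG Block. The abstract does not say how affinities are computed, so the scaled dot product with a row-wise softmax, the feature dimension, and the names `affinity_map` and `aggregate` are all assumptions made for illustration.

```python
import numpy as np


def affinity_map(part_feats: np.ndarray) -> np.ndarray:
    """Pairwise affinities for a fully connected graph of body-part nodes.

    part_feats has shape (K, D): one D-dimensional feature per body part.
    Assumption: affinity = row-wise softmax of scaled dot products.
    """
    _, dim = part_feats.shape
    scores = part_feats @ part_feats.T / np.sqrt(dim)  # (K, K), every node pair
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def aggregate(part_feats: np.ndarray, affinity: np.ndarray) -> np.ndarray:
    """Mix each part's feature with all others, weighted by affinity."""
    return affinity @ part_feats


# Hypothetical features for three parts (arms, torso, legs), 8 dims each.
rng = np.random.default_rng(0)
parts = rng.normal(size=(3, 8))
A = affinity_map(parts)
mixed = aggregate(parts, A)
```

Each row of `A` sums to 1, so every part's updated feature is a convex combination of all part features. This is one simple way a per-part representation can depend on the whole body rather than on a local region alone.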

Paper Structure (20 sections, 15 equations, 10 figures, 7 tables)

Figures (10)

  • Figure 1: Examples of the primary processes of our methods. (a) To reshape the human body, our Adaptive Affinity-Graph Network (AAGN) first extracts its skeleton map. (b) An affinity graph is constructed to regulate the consistency of human body parts. (c) The optical flow for warping is estimated. (d) Finally, the images are warped with the flow.
  • Figure 2: Overall pipeline of AAGN. The pipeline is mainly composed of the Adaptive Affinity-Graph (AAG) Block, the Flow Generator (FG), and the Body Shape Discriminator (BSD). First, the original photo and skeleton maps are downsampled. Then, in AAG, different body parts are separately encoded with a CNN as input to the affinity graph. The body part features are reweighted through CBAM with the calculated affinity graph. In FG, conditioned on the output of AAG, the optical flow for warping is estimated. The optical flow is then used to warp the downsampled original photo. During training, the BSD helps regulate the shape of the predicted photo to improve aesthetics.
  • Figure 3: The SRM filters consist of three different filters. We apply all three to the photo to obtain high-frequency features. The Body Shape Discriminator can distinguish photos more accurately based on both these high-frequency features and the spatial features.
  • Figure 4: Visual comparisons among four optical flow-based methods, and the attention map of the estimated flow. Our method can produce high-resolution, believable, and consistent human body reshaping results. The attention map shows the more accurate grounding results of the torso, arms, and legs.
  • Figure 5: An example visualization of $W$ and $A$ in the Adaptive Affinity-Graph Block. The red bounding boxes mark a sample high-attention region in $W_{i-j}$ and $A_{i-j}$.
  • ...and 5 more figures
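
The final steps in Figures 1 and 2, where an estimated optical flow is used to warp the photo, can be sketched as below. This is a minimal illustration, not the paper's implementation. It assumes a backward-warping convention (each output pixel samples the source at its own position plus the flow offset), bilinear interpolation, and border clamping; the function name `warp_with_flow` and the `(dx, dy)` channel order are hypothetical.

```python
import numpy as np


def warp_with_flow(image: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Backward-warp an (H, W, C) image with an (H, W, 2) flow field.

    flow[..., 0] is the x offset and flow[..., 1] the y offset, in pixels.
    Sampling positions outside the image are clamped to the border.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners around each sampling position.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional parts, broadcast over the channel axis.
    wx = (src_x - x0)[..., None]
    wy = (src_y - y0)[..., None]

    # Bilinear interpolation between the four corners.
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


img = np.arange(4 * 5 * 3, dtype=float).reshape(4, 5, 3)
identity = warp_with_flow(img, np.zeros((4, 5, 2)))

shift = np.zeros((4, 5, 2))
shift[..., 0] = 1.0  # sample one pixel to the right everywhere
shifted = warp_with_flow(img, shift)
```

A zero flow leaves the image unchanged, and a constant offset of one pixel in x moves the content one column to the left. In a reshaping setting the flow would vary smoothly across the body so that some regions are pulled inward and the background stays nearly fixed.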