Direct May Not Be the Best: An Incremental Evolution View of Pose Generation

Yuelong Li, Tengfei Xiao, Lei Geng, Jianming Wang

TL;DR

This work reframes pose generation as a sequence of small, guided evolutions rather than a direct large-variance transfer, addressing the non-linear content changes inherent in 3D-to-2D pose projection. It introduces global and incremental evolution constraints and a triple-path knowledge fusion module to progressively synthesize target poses while preserving clothes texture and enabling intermediate poses. Experiments on Turning-Round, Fashion, and Tai-Chi datasets demonstrate competitive quantitative metrics and favorable human perceptual results, validating the approach's effectiveness and scalability. The method offers a principled framework for pose-guided generation that balances fidelity, stability, and by-product utility in real-world applications.

Abstract

Pose diversity is an inherent characteristic of 2D images. Because of the 3D-to-2D projection mechanism, images of distinct poses differ markedly in content, and this discrepancy is the main obstacle in pose transformation research. To address it, we propose a pose generation framework centered on fine-grained incremental evolution, in place of the traditional direct one-to-one transfer performed in a single step. Because the proposed approach bypasses the theoretical difficulty of directly modeling dramatic non-linear variation, the resulting content distortion and blurring can be effectively constrained, while individual pose details, especially clothes texture, can be precisely maintained. To guide the evolution systematically, both global and incremental evolution constraints are designed and merged into the overall framework. A novel triple-path knowledge fusion structure is developed to take full advantage of all available knowledge for high-quality pose synthesis. In addition, our framework generates a series of valuable byproducts, namely the intermediate poses. Extensive experiments verify the effectiveness of the proposed approach. Code is available at https://github.com/Xiaofei-CN/Incremental-Evolution-Pose-Generation.
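
The incremental idea described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the linear interpolation of 2D keypoints, the function names `intermediate_poses` and `evolve`, and the `step_fn` callback (standing in for one synthesis step) are all assumptions made for illustration. The paper's actual guidance comes from its global and incremental evolution constraints, which are not reproduced here.

```python
from typing import Callable, List, Tuple

Keypoint = Tuple[float, float]
Pose = List[Keypoint]


def intermediate_poses(source: Pose, target: Pose, steps: int) -> List[Pose]:
    """Return `steps` poses moving from source toward target; the last equals target.

    Linear interpolation of keypoints is an illustrative assumption,
    not the guidance mechanism used in the paper.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    if len(source) != len(target):
        raise ValueError("poses must have the same number of keypoints")
    schedule = []
    for t in range(1, steps + 1):
        alpha = t / steps  # fraction of the total pose change covered so far
        schedule.append([
            (sx + alpha * (tx - sx), sy + alpha * (ty - sy))
            for (sx, sy), (tx, ty) in zip(source, target)
        ])
    return schedule


def evolve(source_image, source_pose: Pose, target_pose: Pose,
           steps: int, step_fn: Callable):
    """Apply `step_fn` once per small pose increment, feeding each output forward.

    `step_fn(image, current_pose, next_pose)` is a hypothetical stand-in for
    one synthesis step; it only ever sees a small pose change.
    """
    image, pose = source_image, source_pose
    intermediates = []
    for next_pose in intermediate_poses(source_pose, target_pose, steps):
        image = step_fn(image, pose, next_pose)
        pose = next_pose
        intermediates.append(image)  # the intermediate results are kept as byproducts
    return image, intermediates


if __name__ == "__main__":
    # Dummy step: the "image" is just a counter of how many steps were applied.
    final, mids = evolve(0, [(0.0, 0.0)], [(4.0, 8.0)], 4,
                         lambda img, cur, nxt: img + 1)
    print(final, mids)
```

With a direct one-to-one approach, `steps` would effectively be 1 and the single call would have to absorb the whole pose change. Splitting the change into increments is what yields the intermediate outputs as a side effect.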

Paper Structure

This paper contains 14 sections, 9 equations, 8 figures, and 4 tables.

Figures (8)

  • Figure 1: The basic flows of classical one-to-one pose generation (a) and proposed incremental evolution synthesis (b).
  • Figure 2: Overview of the proposed framework, where the top left shows the dual input, namely the source image and target pose. The upper left of the figure demonstrates the recurrent, progressive generation of the global evolution constraints. The middle part shows the triple-path knowledge fusion based pose synthesis at iteration t, which is the core unit of the proposed incremental evolution pose generation.
  • Figure 3: Synthesized poses and the corresponding incremental evolution intermediates, with skeleton and semantics, on the Turning-Round and Fashion datasets.
  • Figure 4: Pose synthesis accuracy on the Turning-Round dataset when a few intermediate evolution increments are randomly removed from the generation flow.
  • Figure 5: Qualitative comparison of pose synthesis on the Fashion dataset. Please zoom in for a better view.
  • ...and 3 more figures