Pomo3D: 3D-Aware Portrait Accessorizing and More

Tzu-Chieh Liu, Chih-Ting Liu, Shao-Yi Chien

TL;DR

Pomo3D lets avatars take on out-of-distribution appearances by wearing multiple accessories simultaneously, and introduces the Scribble2Accessories module, which creates 3D accessories from user-drawn accessory scribble maps.

Abstract

We propose Pomo3D, a 3D portrait manipulation framework that allows free accessorizing by decomposing and recomposing portraits and accessories. It enables avatars to attain out-of-distribution (OOD) appearances in which multiple accessories are worn simultaneously. Existing methods still struggle to offer such explicit and fine-grained editing; they either fail to generate additional objects on given portraits or alter the portraits (e.g., identity shift) when generating accessories. This limitation is a notable obstacle, since people typically seek to create charming appearances with diverse and fashionable accessories in the virtual universe. Our approach provides an effective solution to this less-addressed issue. We further introduce the Scribble2Accessories module, enabling Pomo3D to create 3D accessories from user-drawn accessory scribble maps. Moreover, we design a bias-conscious mapper to mitigate biased associations present in real-world datasets. In addition to the object-level manipulation above, Pomo3D also offers extensive editing options on portraits, including global or local editing of geometry and texture and avatar stylization, elevating 3D editing of neural portraits to a more comprehensive level.

Paper Structure (46 sections, 5 equations, 31 figures, 1 table)

Figures (31)

  • Figure 1: Through user-drawn shapes of scribbles (top row) and diverse texture selections, Pomo3D can generate personalized accessories on specified avatars (top left in GUI). These scribble maps can be drawn directly within our provided GUI, and multiple scribbles can be stacked so that several accessories are worn at once. All results are multi-view consistent and rendered at an interactive frame rate.
  • Figure 2: Overview. (a) Generation of dual geometry tri-planes: we construct two tri-planes for the geometry modeling of portraits and accessories. We then obtain the projected feature maps and corresponding semantic maps via volume rendering. (b) Structure-guided texture renderer: next, the structure encoder and texture renderer fuse the two projected feature maps and yield the output image. The variable $\mathnormal{Accs}$ indicates whether the accessory is worn on the portrait, and thus there are two possible outcomes. (c) Bias-conscious mapper: considering the biases in existing datasets, a bias-conscious mapper is proposed to map Gaussian noise into four latent codes for corresponding attributes. (d) Data preparation and training scheme: PAC-Mask consists of three data groups: accessory semantic maps, portrait semantic maps, and overall RGB images. During training, we use three different discriminators along with these three data groups to conduct adversarial learning on the three branches of the network.
  • Figure 3: Bias-Conscious Mapper. Our mapping network generates style codes that are aware of both pose and identity, through pose conditioning and identity conditioning. $\mathcal{W}_{acc,g}^{*}$ and $\mathcal{W}_{acc,g}$ are the identity-uncorrelated and identity-correlated spaces, respectively.
  • Figure 4: Accessories and beards can be generated either by a random accessory geometry code (first row) or by the user's scribble map (bottom two rows). They can be created from any viewpoint, not limited to the frontal view. The two examples in the bottom right corner demonstrate that different stroke widths lead to different types of beards.
  • Figure 5: Training of Scribble2Accessories. We train the encoder $\mathnormal{E}_{acc}$ and the accessory codebook with two types of cycle consistency while fixing the pre-trained generator.
  • ...and 26 more figures
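
The Figure 2 caption describes two geometry tri-planes, one for the portrait and one for the accessory. As a rough illustration of what querying such a representation can look like, the NumPy sketch below samples features for 3D points from a tri-plane by projecting each point onto three axis-aligned planes and interpolating bilinearly. This is a generic tri-plane lookup, not the authors' implementation: the function names, the plane ordering (XY, XZ, YZ), the aggregation by summation, and the array shapes are all assumptions made for the example.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Sample a (C, H, W) feature plane at coordinates u, v in [-1, 1].

    Returns an (N, C) array. Assumes H >= 2 and W >= 2.
    """
    _, H, W = plane.shape
    # Map normalized coordinates to continuous pixel coordinates.
    x = (u + 1.0) * 0.5 * (W - 1)
    y = (v + 1.0) * 0.5 * (H - 1)
    # Top-left corner of the interpolation cell, clipped so that +1 stays in range.
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    fx, fy = x - x0, y - y0
    top = plane[:, y0, x0] * (1 - fx) + plane[:, y0, x0 + 1] * fx
    bot = plane[:, y0 + 1, x0] * (1 - fx) + plane[:, y0 + 1, x0 + 1] * fx
    return (top * (1 - fy) + bot * fy).T

def sample_triplane(planes, points):
    """Query a tri-plane of shape (3, C, H, W) at points of shape (N, 3).

    Plane order (XY, XZ, YZ) and summation are assumptions of this sketch.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    f_xy = bilinear_sample(planes[0], x, y)
    f_xz = bilinear_sample(planes[1], x, z)
    f_yz = bilinear_sample(planes[2], y, z)
    return f_xy + f_xz + f_yz

# Hypothetical usage: one tri-plane per branch, queried at the same points.
rng = np.random.default_rng(0)
portrait_planes = rng.normal(size=(3, 4, 8, 8))
accessory_planes = rng.normal(size=(3, 4, 8, 8))
points = rng.uniform(-1.0, 1.0, size=(5, 3))
portrait_feats = sample_triplane(portrait_planes, points)    # shape (5, 4)
accessory_feats = sample_triplane(accessory_planes, points)  # shape (5, 4)
```

Keeping the two tri-planes separate means the portrait and accessory features are queried independently at the same 3D points. According to the caption, each branch is then volume-rendered into a projected feature map and the two maps are fused later by the structure encoder and texture renderer; that stage is not shown here.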