PEGAsus: 3D Personalization of Geometry and Appearance

Jingyu Hu, Bin Hu, Ka-Hei Hui, Haipeng Li, Zhengzhe Liu, Daniel Cohen-Or, Chi-Wing Fu

TL;DR

This paper tackles 3D personalization by learning reusable geometry and appearance concepts from a reference shape to guide text-conditioned generation. It decouples geometry and appearance using TRELLIS's two-stage pipeline and introduces global and region-wise concept learning with a progressive optimization strategy. A joint representation (learnable text embeddings paired with fine-tuned generators) enables flexible composition with text prompts to synthesize diverse, cross-category shapes. Extensive experiments on Objaverse-XL show that PEGAsus achieves superior fidelity and controllability for both geometry and appearance, with ablations validating the necessity of progressive optimization and region-wise losses.

Abstract

We present PEGAsus, a new framework capable of generating Personalized 3D shapes by learning shape concepts at both Geometry and Appearance levels. First, we formulate 3D shape personalization as extracting reusable, category-agnostic geometric and appearance attributes from reference shapes, and composing these attributes with text to generate novel shapes. Second, we design a progressive optimization strategy to learn shape concepts at both the geometry and appearance levels, decoupling the shape concept learning process. Third, we extend our approach to region-wise concept learning, enabling flexible concept extraction, with context-aware and context-free losses. Extensive experimental results show that PEGAsus is able to effectively extract attributes from a wide range of reference shapes and then flexibly compose these concepts with text to synthesize new shapes. This enables fine-grained control over shape generation and supports the creation of diverse, personalized results, even in challenging cross-category scenarios. Both quantitative and qualitative experiments demonstrate that our approach outperforms existing state-of-the-art solutions.
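
The progressive optimization strategy can be illustrated with a toy example. The Figure 2 caption describes two steps: first optimize the text embedding, then fine-tune the generator. The sketch below is not the paper's implementation. It assumes a linear "generator" `W`, a squared-error loss, plain gradient descent, and an embedding held fixed during the second step. All names (`optimize_embedding`, `finetune_generator`) and all hyperparameters are hypothetical.

```python
import numpy as np


def loss(W, s, y):
    """Squared error between the toy generator output W @ s and the reference y."""
    r = W @ s - y
    return float(r @ r)


def optimize_embedding(W, s, y, steps=100, lr=0.1):
    """Step 1 (assumed form): update only the embedding s; the generator W is frozen."""
    s = s.copy()
    for _ in range(steps):
        r = W @ s - y
        s -= lr * 2.0 * (W.T @ r)  # gradient of the loss with respect to s
    return s


def finetune_generator(W, s, y, steps=100, lr=0.05):
    """Step 2 (assumed form): update only the generator W; the embedding s is fixed."""
    W = W.copy()
    for _ in range(steps):
        r = W @ s - y
        W -= lr * 2.0 * np.outer(r, s)  # gradient of the loss with respect to W
    return W


# Toy reference that the frozen generator cannot reach by changing s alone.
W0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
s0 = np.zeros(2)
y = np.array([1.0, 2.0, 3.0])

s1 = optimize_embedding(W0, s0, y)   # embedding absorbs what it can
W1 = finetune_generator(W0, s1, y)   # generator absorbs the remaining residual

print(loss(W0, s0, y), loss(W0, s1, y), loss(W1, s1, y))
```

In this toy setting the embedding step reduces the error as far as the frozen generator allows, and the fine-tuning step removes the remaining residual. This mirrors the ordering in the caption, not the actual training objective.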

Paper Structure (31 sections, 10 equations, 13 figures, 3 tables)

Figures (13)

  • Figure 1: PEGAsus is capable of extracting shape concepts for the global appearance of a watermelon, region-wise appearance of a stripe pattern, global geometry of a flower, and region-wise geometry of frog legs, then composing these learned concepts with text prompts in different ways to synthesize the four new shapes shown above, e.g., a chair whose lower part follows the frog legs and appearance follows the watermelon.
  • Figure 2: Global concept learning in PEGAsus. Given a reference shape, we first optimize the text embedding $S_{global}$ to $S'_{global}$, then fine-tune the geometry generator $\mathcal{G}_{geo}$ to $\mathcal{G}'_{geo}$; both steps are guided by the global (geometry) loss $\mathcal{L}_{geo}^{global}$. At inference, we compose the learned concept $S'_{global}$ with a text prompt and use the fine-tuned model to generate new shapes that inherit the reference attributes. This figure shows global concept learning for geometry; the same pipeline applies to appearance by replacing $\mathcal{G}_{geo}$ with the appearance generator and switching the reference data from geometry to appearance.
  • Figure 3: Region-wise concept learning in PEGAsus. Given a reference appearance with a user-specified region, we apply a progressive optimization strategy with a region-wise (appearance) objective $\mathcal{L}_{app}^{region}$. This objective combines $\mathcal{L}_{app}^{ctx}$ and $\mathcal{L}_{app}^{free}$ to encourage the learned concept to be visually coherent and conceptually independent. At inference, the learned concept synthesizes a result that inherits the attributes of the specified region. While this figure shows region-wise concept learning for appearance, the pipeline applies to geometry by replacing $\mathcal{G}_{app}$ with $\mathcal{G}_{geo}$ and switching the reference data to geometry.
  • Figure 4: Region-wise concept learning objectives. (i) The Context-Aware Loss takes the full reference as input and computes the loss within the masked region. (ii) The Context-Free Loss takes the masked input and computes the loss within the masked region.
  • Figure 5: Our method supports extracting concepts from global geometry, region-wise geometry, global appearance, and region-wise appearance, and composing the learned concepts with text to generate novel shapes.
  • ...and 8 more figures
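
The difference between the two region-wise objectives in the Figure 4 caption can be sketched in a few lines. This is a toy illustration, not the paper's objective. It assumes a mean-squared-error loss over array-valued shapes, a stand-in `model` callable, zero-filling to mask the input, and a weighted sum to combine the two terms. The function names and the weights `w_ctx` and `w_free` are hypothetical.

```python
import numpy as np


def masked_mse(pred, target, mask):
    """Mean squared error restricted to positions where mask is True."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0
    diff = (pred - target)[mask]
    return float(np.mean(diff ** 2))


def context_aware_loss(model, reference, mask, target):
    """Feed the full reference, then score only the masked region."""
    return masked_mse(model(reference), target, mask)


def context_free_loss(model, reference, mask, target):
    """Zero out everything outside the region before the forward pass,
    then score only the masked region."""
    mask = np.asarray(mask, dtype=bool)
    masked_input = np.where(mask, reference, 0.0)
    return masked_mse(model(masked_input), target, mask)


def region_loss(model, reference, mask, target, w_ctx=1.0, w_free=1.0):
    """Assumed combination: a weighted sum of the two terms."""
    return (w_ctx * context_aware_loss(model, reference, mask, target)
            + w_free * context_free_loss(model, reference, mask, target))


# Stand-in model whose output depends on global context (the input mean).
def toy_model(x):
    return x + x.mean()


reference = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([True, True, False, False])

l_ctx = context_aware_loss(toy_model, reference, mask, reference)
l_free = context_free_loss(toy_model, reference, mask, reference)
l_region = region_loss(toy_model, reference, mask, reference)
print(l_ctx, l_free, l_region)
```

Because the toy model depends on the whole input, the two terms differ even though both are scored on the same region. The first sees the surrounding context, and the second sees the region in isolation.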