MARS: Mesh AutoRegressive Model for 3D Shape Detailization

Jingnan Gao, Weizhe Liu, Weixuan Sun, Senbo Wang, Xibin Song, Taizhang Shang, Shenzhou Chen, Hongdong Li, Xiaokang Yang, Yichao Yan, Pan Ji

TL;DR

MARS addresses the problem of producing high-fidelity 3D mesh details from coarse inputs while maintaining global shape across diverse categories. It introduces a multi-LOD 3D VQVAE with geometry-consistency supervision to tokenize meshes into discrete multi-LOD tokens and a mesh autoregressive model that predicts next-LOD tokens to progressively refine detail. The approach achieves state-of-the-art results on the 3D Shape Detailization benchmark, outperforming GAN-based detailization methods in both qualitative fidelity and quantitative metrics, and demonstrates robust performance across multi-category and out-of-distribution scenarios. This framework enables efficient, diverse, and structurally coherent 3D detailization suitable for mesh refinement and downstream rendering workflows.

Abstract

State-of-the-art methods for mesh detailization predominantly utilize Generative Adversarial Networks (GANs) to generate detailed meshes from coarse ones. These methods typically learn a specific style code for each category or similar categories without enforcing geometry supervision across different Levels of Detail (LODs). Consequently, such methods often fail to generalize across a broader range of categories and cannot ensure shape consistency throughout the detailization process. In this paper, we introduce MARS, a novel approach for 3D shape detailization. Our method capitalizes on a novel multi-LOD, multi-category mesh representation to learn shape-consistent mesh representations in latent space across different LODs. We further propose a mesh autoregressive model capable of generating such latent representations through next-LOD token prediction. This approach significantly enhances the realism of the generated shapes. Extensive experiments conducted on the challenging 3D Shape Detailization benchmark demonstrate that our proposed MARS model achieves state-of-the-art performance, surpassing existing methods in both qualitative and quantitative assessments. Notably, the model's capability to generate fine-grained details while preserving the overall shape integrity is particularly commendable.
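The next-LOD token prediction described in the abstract can be pictured as a coarse-to-fine loop: tokens for each finer level are generated conditioned on all tokens from the coarser levels. The following is only a minimal sketch of that control flow, not the authors' implementation. The per-LOD grid sizes, the `toy_next_lod_model` stand-in (which samples uniformly instead of running a trained network), and the function names are all assumptions made for illustration.

```python
import random

CODEBOOK_SIZE = 8192          # one of the codebook sizes mentioned in Figure 5
LOD_RESOLUTIONS = [2, 4, 8]   # hypothetical token-grid side lengths, coarse to fine


def toy_next_lod_model(prefix, n_tokens, rng):
    """Hypothetical stand-in for a trained autoregressive model.

    A real model would condition on `prefix`; this one ignores it and
    samples token ids uniformly so the loop can be run end to end.
    """
    return [rng.randrange(CODEBOOK_SIZE) for _ in range(n_tokens)]


def detailize(coarse_tokens, resolutions, model, seed=0):
    """Generate tokens LOD by LOD, feeding all coarser tokens as the prefix."""
    rng = random.Random(seed)
    lods = [list(coarse_tokens)]
    for res in resolutions[1:]:
        # Flatten every token generated so far into one conditioning prefix.
        prefix = [tok for lod in lods for tok in lod]
        lods.append(model(prefix, res ** 3, rng))
    return lods


coarse = [0] * (LOD_RESOLUTIONS[0] ** 3)   # placeholder coarse-level tokens
lods = detailize(coarse, LOD_RESOLUTIONS, toy_next_lod_model)
print([len(level) for level in lods])      # prints [8, 64, 512]
```

In a real system the stand-in would be replaced by a trained model, and the final token grids would be passed to the VQVAE decoder to recover a mesh; neither step is shown here.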

Paper Structure

This paper contains 11 sections, 8 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Overview of MARS. Our method initially employs a multi-LOD 3D Vector Quantized Variational Autoencoder (VQVAE) to tokenize input 3D meshes into discrete tokens representing multiple levels of detail. To effectively capture the geometric information across these varying levels, we have devised a geometry-consistency supervision strategy that enhances the training of the VQVAE. For the task of 3D shape detailization, we integrate a mesh autoregressive model that predicts next-LOD mesh tokens. Consequently, our model generates a detailized output that exhibits high-quality geometric details from a coarse input, thereby achieving sophisticated detail enhancement while maintaining structural integrity.
  • Figure 2: Comparison with previous detailization approaches. We conduct a comparative analysis of our model with existing detailization methods, specifically ShaDDR and DECOLLAGE. For each coarse input, we demonstrate two distinct detailization styles. It is observed that both ShaDDR and DECOLLAGE often produce outputs with compromised mesh integrity. In contrast, our method consistently generates complete outputs that exhibit high-quality geometric details, thereby underscoring the robustness and efficacy of our approach in handling complex detailization tasks.
  • Figure 3: Diverse generation results. Our approach can generate a diverse array of detailed meshes from a single coarse input, which underscores its robustness and adaptability across various detailization scenarios.
  • Figure 4: Reconstruction ablation of geometry-consistency supervision. The ablation shows that geometry-consistency supervision is crucial in our shape detailization process.
  • Figure 5: Comparison of reconstruction using different codebook sizes. It is apparent that our method yields comparable outcomes when employing codebook sizes of 8192 and 16384, with these results surpassing those obtained using a codebook size of 4096.
  • ...and 6 more figures
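
The tokenization step named in Figure 1 and the codebook-size comparison in Figure 5 both rest on vector quantization: each latent vector is replaced by the index of its nearest codebook entry. The sketch below illustrates only that nearest-neighbor lookup on random toy data. The dimensions, codebook sizes (64 and 256 rather than the 4096, 8192, and 16384 in Figure 5), and function names are assumptions, and it does not reproduce the paper's VQVAE or its results.

```python
import random


def quantize(vec, codebook):
    """Return (index, code) of the codebook entry nearest to vec (squared L2)."""
    best = min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )
    return best, codebook[best]


def mean_quantization_error(vectors, codebook):
    """Average squared distance between each vector and its assigned code."""
    total = 0.0
    for v in vectors:
        _, code = quantize(v, codebook)
        total += sum((a - b) ** 2 for a, b in zip(v, code))
    return total / len(vectors)


rng = random.Random(0)
dim = 4
vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
big = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(256)]
small = big[:64]  # nested subset, so the larger codebook can never do worse

err_small = mean_quantization_error(vectors, small)
err_big = mean_quantization_error(vectors, big)
print(err_small >= err_big)  # prints True
```

Because the small codebook here is a subset of the large one, the larger codebook's error is never higher by construction. Independently trained codebooks carry no such guarantee, and gains from a larger codebook may level off, as Figure 5 reports for sizes 8192 and 16384.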