EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang
TL;DR
EdgeRunner introduces an Auto-regressive Auto-encoder (ArAE) with a novel EdgeBreaker-based mesh tokenizer that generates artistic meshes with up to 4,000 faces at a spatial resolution of $512^3$. By compressing variable-length meshes into a fixed-length latent space, it enables latent diffusion models conditioned on point clouds or single-view images, which improves generalization and supports cross-modal generation. The approach achieves higher quality and diversity than prior auto-regressive mesh methods, with efficient training and competitive inference times. This work advances scalable, topology-preserving 3D mesh generation for downstream creative applications.
Abstract
Current auto-regressive mesh generation methods suffer from issues such as incompleteness, insufficient detail, and poor generalization. In this paper, we propose an Auto-regressive Auto-encoder (ArAE) model capable of generating high-quality 3D meshes with up to 4,000 faces at a spatial resolution of $512^3$. We introduce a novel mesh tokenization algorithm that efficiently compresses triangular meshes into 1D token sequences, significantly enhancing training efficiency. Furthermore, our model compresses variable-length triangular meshes into a fixed-length latent space, which enables the training of latent diffusion models for better generalization. Extensive experiments demonstrate the superior quality, diversity, and generalization capabilities of our model in both point cloud-conditioned and image-conditioned mesh generation tasks.
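As an illustration of what a spatial resolution of $512^3$ can mean in practice, the following minimal sketch quantizes vertex coordinates into 512 bins per axis. It assumes vertices are normalized to the cube $[-1, 1]^3$ and that each axis is discretized uniformly; the function names `quantize_vertices` and `dequantize_vertices` are hypothetical and are not taken from the paper. The sketch covers only coordinate discretization, not the EdgeBreaker-based tokenizer itself.

```python
def quantize_vertices(vertices, resolution=512):
    """Map coordinates in [-1, 1] to integer bins in [0, resolution - 1].

    Assumption: uniform per-axis binning; the paper's exact scheme may differ.
    """
    quantized = []
    for vertex in vertices:
        bins = []
        for coord in vertex:
            # Clamp so out-of-range inputs still land in a valid bin.
            clamped = min(max(coord, -1.0), 1.0)
            index = int((clamped + 1.0) / 2.0 * resolution)
            # The upper boundary (coord == 1.0) belongs to the last bin.
            bins.append(min(index, resolution - 1))
        quantized.append(tuple(bins))
    return quantized


def dequantize_vertices(quantized, resolution=512):
    """Map each bin index back to the centre of its cell in [-1, 1]."""
    return [
        tuple((index + 0.5) / resolution * 2.0 - 1.0 for index in vertex)
        for vertex in quantized
    ]


if __name__ == "__main__":
    triangle = [(-1.0, 0.0, 1.0), (0.25, -0.5, 0.75), (0.1, 0.2, 0.3)]
    tokens = quantize_vertices(triangle)
    print(tokens)
    print(dequantize_vertices(tokens))
```

Under these assumptions, each coordinate becomes one of 512 discrete values, and the round-trip error per coordinate is at most half a cell width, that is, $1/512$ in normalized units.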
