FastMesh: Efficient Artistic Mesh Generation via Component Decoupling
Jeonghwan Kim, Yushi Lan, Armando Fortes, Yongwei Chen, Xingang Pan
TL;DR
FastMesh tackles the inefficiency of autoregressive mesh generation by decoupling vertex and face construction. It generates vertices autoregressively with block-wise indexing, then uses a bidirectional transformer to infer edges and assemble faces in a single step, supplemented by a fidelity enhancer and a prediction-filtering post-process. The approach needs only about 23% of the tokens required by the most compact existing tokenizer and achieves up to an 8x speedup on Toys4K while delivering higher mesh quality than prior methods. This decoupled pipeline enables faster, more robust artistic mesh generation conditioned on shape inputs and is compatible with broader 3D generation pipelines.
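As a rough illustration of the face-assembly step, the sketch below assumes the inferred edges are already available as a symmetric 0/1 adjacency matrix and treats every mutually connected vertex triple as a triangle. The function name `faces_from_adjacency` and the example mesh are hypothetical, and the paper's actual face construction and filtering may differ.

```python
from typing import List, Tuple


def faces_from_adjacency(adj: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Enumerate triangles (i < j < k) whose three edges are all present.

    Assumption: adj is a square matrix where a nonzero adj[i][j] or adj[j][i]
    marks an undirected edge between vertices i and j.
    """
    n = len(adj)

    def connected(a: int, b: int) -> bool:
        # Treat the matrix as undirected even if only one triangle half is filled.
        return bool(adj[a][b]) or bool(adj[b][a])

    faces = []
    for i in range(n):
        for j in range(i + 1, n):
            if not connected(i, j):
                continue
            for k in range(j + 1, n):
                if connected(i, k) and connected(j, k):
                    faces.append((i, j, k))
    return faces


# Hypothetical example: a square (0-1-2-3) split by the diagonal 0-2.
square_adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
square_faces = faces_from_adjacency(square_adj)
print(square_faces)
```

This brute-force enumeration is cubic in the number of vertices and would also emit spurious triangles wherever three predicted edges happen to close a loop, which is one reason a filtering step after edge prediction is useful.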
Abstract
Recent mesh generation approaches typically tokenize triangle meshes into sequences of tokens and train autoregressive models to generate these tokens sequentially. Despite substantial progress, such token sequences inevitably reuse vertices multiple times to fully represent manifold meshes, as each vertex is shared by multiple faces. This redundancy leads to excessively long token sequences and inefficient generation processes. In this paper, we propose an efficient framework that generates artistic meshes by treating vertices and faces separately, significantly reducing redundancy. We employ an autoregressive model solely for vertex generation, decreasing the token count to approximately 23% of that required by the most compact existing tokenizer. Next, we leverage a bidirectional transformer to complete the mesh in a single step by capturing inter-vertex relationships and constructing the adjacency matrix that defines the mesh faces. To further improve the generation quality, we introduce a fidelity enhancer to refine vertex positioning into more natural arrangements and propose a post-processing framework to remove undesirable edge connections. Experimental results show that our method generates meshes more than 8x faster than state-of-the-art approaches while producing higher mesh quality.
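The redundancy argument can be made concrete with a small count. The sketch below is an illustration only: it compares a naive face-by-face tokenization (three coordinates for each of a face's three vertices) with emitting every vertex exactly once. The helper names and the tetrahedron example are hypothetical, and the baseline is the naive scheme, not the compact tokenizer behind the 23% figure, so the ratio it prints is not expected to match that number.

```python
from collections import Counter
from typing import List, Tuple

Face = Tuple[int, int, int]


def naive_face_token_count(faces: List[Face]) -> int:
    """Tokens when each face spells out x, y, z for all three of its vertices."""
    return len(faces) * 3 * 3


def vertex_only_token_count(num_vertices: int) -> int:
    """Tokens when each vertex is emitted once as three coordinates."""
    return num_vertices * 3


def vertex_reuse(faces: List[Face]) -> Counter:
    """How many faces reference each vertex index."""
    return Counter(v for face in faces for v in face)


# Hypothetical example: a tetrahedron with 4 vertices and 4 triangular faces.
tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
num_vertices = 4

naive_tokens = naive_face_token_count(tetra_faces)
vertex_tokens = vertex_only_token_count(num_vertices)
reuse = vertex_reuse(tetra_faces)

print(f"naive face tokens:  {naive_tokens}")
print(f"vertex-only tokens: {vertex_tokens}")
print(f"ratio:              {vertex_tokens / naive_tokens:.2f}")
print(f"faces per vertex:   {dict(reuse)}")
```

Every vertex of the tetrahedron appears in three faces, so the naive sequence repeats each coordinate triple three times. Vertex-only generation removes that repetition but leaves the connectivity to be recovered separately.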
