Mesh RAG: Retrieval Augmentation for Autoregressive Mesh Generation
Xiatao Sun, Chen Liang, Qian Wang, Daniel Rakita
TL;DR
This work tackles the quality-speed trade-off and limited editability of autoregressive 3D mesh generation. It introduces Mesh RAG, a training-free, plug-and-play framework that segments a point-cloud prompt, generates the parts in parallel, and uses a two-stage transformation retrieval (coarse AABB alignment followed by ICP refinement) to place the parts coherently, which also enables incremental editing without retraining. Across multiple autoregressive baselines, Mesh RAG yields substantial gains in geometric fidelity and, for larger models, faster inference, while enabling precise, localized edits. The approach generalizes to multi-modal prompts via an intermediate SLAT representation, and an open-source implementation is provided to support further research on retrieval-augmented mesh generation.
Abstract
3D meshes are a critical building block for applications ranging from industrial design and gaming to simulation and robotics. Traditionally, meshes are crafted manually by artists, a process that is time-intensive and difficult to scale. To automate and accelerate this asset creation, autoregressive models have emerged as a powerful paradigm for artistic mesh generation. However, current methods for improving quality typically rely on larger models or longer sequences, both of which increase generation time, and their inherently sequential nature imposes a severe quality-speed trade-off. This sequential dependency also significantly complicates incremental editing. To overcome these limitations, we propose Mesh RAG, a novel, training-free, plug-and-play framework for autoregressive mesh generation models. Inspired by retrieval-augmented generation (RAG) for language models, our approach augments the generation process by leveraging point cloud segmentation, spatial transformation, and point cloud registration to retrieve, generate, and integrate mesh components. This retrieval-based approach decouples generation from its strict sequential dependency, facilitating efficient and parallelizable inference. We demonstrate the wide applicability of Mesh RAG across various foundational autoregressive mesh generation models, showing that it significantly enhances mesh quality, accelerates generation compared to sequential part prediction, and enables incremental editing, all without model retraining.
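The placement step named above (a spatial transformation followed by point cloud registration, described in the TL;DR as coarse AABB alignment and then ICP refinement) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (aabb_align, kabsch, icp_refine, place_part), the per-axis scaling in the coarse step, the brute-force nearest-neighbour search, and the fixed iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def aabb_align(part, target):
    """Coarse step (assumed form): scale and translate `part` so that its
    axis-aligned bounding box matches the bounding box of `target`."""
    p_min, p_max = part.min(axis=0), part.max(axis=0)
    t_min, t_max = target.min(axis=0), target.max(axis=0)
    p_ext = p_max - p_min
    # Guard against flat (zero-extent) axes: leave their scale at 1.
    safe = np.where(p_ext > 1e-12, p_ext, 1.0)
    scale = np.where(p_ext > 1e-12, (t_max - t_min) / safe, 1.0)
    p_center, t_center = (p_min + p_max) / 2, (t_min + t_max) / 2
    return (part - p_center) * scale + t_center

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t,
    for paired points given as rows."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp_refine(part, target, iters=30):
    """Fine step: point-to-point ICP with brute-force nearest neighbours
    (adequate for small clouds; a KD-tree would be used at scale)."""
    cur = part.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
        nearest = target[d2.argmin(axis=1)]
        R, t = kabsch(cur, nearest)
        cur = cur @ R.T + t
    return cur

def place_part(part_points, segment_points):
    """Hypothetical wrapper: coarse AABB alignment, then ICP refinement."""
    return icp_refine(aabb_align(part_points, segment_points), segment_points)
```

Here part_points would be points sampled from an independently generated part and segment_points the corresponding segment of the point-cloud prompt. Because each part is placed against its own segment, the placements are independent of one another, which is consistent with the parallel generation and localized editing the abstract describes.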
