Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning
Jian Liu, Jing Xu, Song Guo, Jing Li, Jingfeng Guo, Jiaao Yu, Haohan Weng, Biwen Lei, Xianghui Yang, Zhuo Chen, Fangqi Zhu, Tao Han, Chunchao Guo
TL;DR
Mesh-RFT tackles the challenge of producing high-fidelity, topologically sound 3D meshes by moving beyond global reinforcement signals to fine-grained, face-level refinement. The method combines Masked Direct Preference Optimization (M-DPO) with a topology-aware scoring system that uses Boundary Edge Ratio ($BER$) and Topology Score ($TS$) to guide localized improvements while preserving global coherence. A three-stage pipeline (pretraining, preference dataset construction, and post-training with quality-aware masks) yields substantial gains in geometric integrity and topological regularity over both pretrained baselines and global DPO methods: Hausdorff Distance ($HD$) drops by 24.6% relative to pretrained models, and $TS$ improves by 4.9% relative to global DPO. The approach sets a new state of the art for production-ready mesh generation and highlights the value of localized reinforcement learning in 3D asset creation for industry applications.
Abstract
Existing pretrained models for 3D mesh generation often suffer from data biases and produce low-quality results, while global reinforcement learning (RL) methods rely on object-level rewards that struggle to capture local structural details. To address these challenges, we present Mesh-RFT, a novel fine-grained reinforcement fine-tuning framework that employs Masked Direct Preference Optimization (M-DPO) to enable localized refinement via quality-aware face masking. To facilitate efficient quality evaluation, we introduce an objective topology-aware scoring system that assesses geometric integrity and topological regularity at both the object and face levels through two metrics: Boundary Edge Ratio (BER) and Topology Score (TS). By integrating these metrics into a fine-grained RL strategy, Mesh-RFT becomes the first method to optimize mesh quality at the granularity of individual faces, resolving localized errors while preserving global coherence. Experimental results show that our M-DPO approach reduces Hausdorff Distance (HD) by 24.6% and improves Topology Score (TS) by 3.8% over pretrained models, while outperforming global DPO methods with a 17.4% HD reduction and a 4.9% TS gain. These results demonstrate Mesh-RFT's ability to improve geometric integrity and topological regularity, achieving new state-of-the-art performance in production-ready mesh generation. Project Page: https://hitcslj.github.io/mesh-rft/.
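
To make the Boundary Edge Ratio concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not give the exact formula, so the sketch assumes BER is the fraction of unique undirected edges that belong to exactly one face (the usual meaning of a boundary edge). The function name `boundary_edge_ratio` and the face-list input format are hypothetical choices made for illustration.

```python
from collections import Counter


def boundary_edge_ratio(faces):
    """Assumed definition: (# edges used by exactly one face) / (# unique edges).

    `faces` is a sequence of vertex-index tuples, e.g. triangles or quads.
    This is an illustrative reading of BER, not the paper's exact formula.
    """
    edge_counts = Counter()
    for face in faces:
        n = len(face)
        for i in range(n):
            a, b = face[i], face[(i + 1) % n]
            # Store each edge with sorted endpoints so (a, b) and (b, a) match.
            edge_counts[(min(a, b), max(a, b))] += 1

    if not edge_counts:
        return 0.0  # Convention chosen here for an empty mesh.

    boundary = sum(1 for count in edge_counts.values() if count == 1)
    return boundary / len(edge_counts)


# A closed tetrahedron has no boundary edges; an isolated triangle has only
# boundary edges; two triangles sharing one edge have 4 of 5 edges on the boundary.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
single_triangle = [(0, 1, 2)]
two_triangles = [(0, 1, 2), (0, 2, 3)]

ber_closed = boundary_edge_ratio(tetrahedron)
ber_open = boundary_edge_ratio(single_triangle)
ber_partial = boundary_edge_ratio(two_triangles)
```

Under this assumed definition, a lower value indicates a more closed surface, which is consistent with using the metric as a signal of geometric integrity.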
