End-to-End Fine-Tuning of 3D Texture Generation using Differentiable Rewards
AmirHossein Zamani, Tianhao Xie, Amir G. Aghdam, Tiberiu Popa, Eugene Belilovsky
TL;DR
The paper tackles 3D texture generation by embedding differentiable reward signals into an end-to-end texture synthesis pipeline, removing the need for reinforcement learning while producing geometry-aware textures. It introduces three geometry-aware reward functions and uses differentiable rendering to backpropagate preference signals through both the geometric and appearance modules, so that generated textures align with the underlying 3D structure. To keep training feasible, it relies on LoRA and gradient checkpointing for memory efficiency and stability. In qualitative, quantitative, and user-preference evaluations, the method outperforms state-of-the-art baselines in texture quality and geometric coherence. The work points toward interactive, geometry-consistent 3D content generation and could extend to joint optimization of geometry and texture.
Abstract
While recent 3D generative models can produce high-quality texture images, they often fail to capture human preferences or meet task-specific requirements. Moreover, a core challenge in 3D texture generation is that most existing approaches rely on repeated calls to 2D text-to-image generative models, which lack an inherent understanding of the 3D structure of the input mesh. To alleviate these issues, we propose an end-to-end differentiable, reinforcement-learning-free framework that embeds human feedback, expressed as differentiable reward functions, directly into the 3D texture synthesis pipeline. By back-propagating preference signals through both the geometric and appearance modules of the proposed framework, our method generates textures that respect the 3D geometric structure and align with the desired criteria. To demonstrate its versatility, we introduce three novel geometry-aware reward functions, which offer a more controllable and interpretable pathway for creating high-quality 3D content from natural language. Through qualitative, quantitative, and user-preference evaluations against state-of-the-art methods, we demonstrate that our proposed strategy consistently outperforms existing approaches. Our implementation is publicly available at: https://github.com/AHHHZ975/Differentiable-Texture-Learning
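The core idea of back-propagating a differentiable reward through a renderer, instead of estimating policy gradients, can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the linear `render` function stands in for a differentiable renderer, `reward` is a hypothetical squared-error reward rather than one of the paper's geometry-aware rewards, and the gradient is derived by hand where a real pipeline would use automatic differentiation.

```python
# Toy illustration only: a linear "renderer" and a hand-derived gradient stand in
# for the differentiable renderer and autograd used in a real pipeline.

def render(texels, view_weights):
    """Each view's pixel is a fixed weighted sum of texels (differentiable)."""
    return [sum(w * t for w, t in zip(row, texels)) for row in view_weights]

def reward(pixels, targets):
    """Hypothetical differentiable reward: negative squared error to a target."""
    return -sum((p - t) ** 2 for p, t in zip(pixels, targets))

def reward_grad(texels, view_weights, targets):
    """Chain rule: dR/dtexel_j = sum_v dR/dpixel_v * dpixel_v/dtexel_j."""
    pixels = render(texels, view_weights)
    d_pixels = [-2.0 * (p - t) for p, t in zip(pixels, targets)]
    return [sum(d * row[j] for d, row in zip(d_pixels, view_weights))
            for j in range(len(texels))]

def fine_tune(texels, view_weights, targets, lr=0.1, steps=200):
    """Gradient ascent on the reward; no policy-gradient estimator is needed."""
    texels = list(texels)
    for _ in range(steps):
        grad = reward_grad(texels, view_weights, targets)
        texels = [t + lr * g for t, g in zip(texels, grad)]
    return texels

VIEW_WEIGHTS = [[0.7, 0.3, 0.0], [0.0, 0.5, 0.5]]  # two views of three texels
TARGETS = [0.8, 0.2]                               # desired pixel per view
initial = [0.5, 0.5, 0.5]

tuned = fine_tune(initial, VIEW_WEIGHTS, TARGETS)
before = reward(render(initial, VIEW_WEIGHTS), TARGETS)
after = reward(render(tuned, VIEW_WEIGHTS), TARGETS)
print(f"reward before: {before:.4f}, after: {after:.6f}")
```

Because the reward is differentiable with respect to the rendered pixels, and the pixels with respect to the texels, the update direction is exact rather than a sampled estimate. In the setting the abstract describes, the trainable parameters would belong to the generative model rather than being raw texel values.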
