DragTex: Generative Point-Based Texture Editing on 3D Mesh
Yudi Zhang, Qi Xu, Lei Zhang
TL;DR
DragTex addresses the challenge of editing textures directly on 3D meshes with precise spatial control. It introduces a diffusion-based pipeline that blends noisy latent edits locally near silhouettes across views to ensure cross-view consistency, while fine-tuning the decoder to preserve detail in non-drag regions. A key contribution is pre-training LoRA on multi-view images, markedly reducing per-edit training time and enabling efficient interactive editing. Additional robustness comes from static control points to preserve geometry. Together, these components enable plausible, drag-guided texture edits that align with user intent and maintain multi-view coherence, with demonstrated improvements over per-view training and artifact-prone baselines.
Abstract
Creating textured 3D meshes with generative artificial intelligence has recently attracted significant attention. Existing methods support text-based texture generation or editing on 3D meshes, but they often struggle to control the pixels of texture images precisely through more intuitive interaction. Although 2D images can be edited generatively with drag interaction, applying such methods directly to 3D mesh textures leads to issues such as a lack of local consistency across views, error accumulation, and long training times. To address these challenges, we propose DragTex, a generative point-based method for editing 3D mesh textures. The method uses a diffusion model to blend locally inconsistent textures in the region near the deformed silhouette between different views, enabling locally consistent texture editing. In addition, we fine-tune a decoder to reduce reconstruction errors in the non-drag region, thereby mitigating overall error accumulation. Moreover, we train LoRA on multi-view images instead of training on each view individually, which significantly shortens training time. Experimental results show that our method effectively supports dragging textures on 3D meshes and generates plausible textures that align with the intent of the drag interaction.
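
To make the idea of blending only near the deformed silhouette concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes two views' latents have already been brought onto a common 2D grid, builds a band mask around silhouette cells, and mixes the two latents inside that band while leaving the rest of the first view untouched. The function names (band_mask, blend_in_band), the square band shape, and the fixed mixing weight are all hypothetical choices made for illustration; the actual method performs its blending with a diffusion model on noisy latents.

```python
def band_mask(silhouette, radius):
    """Return a 0/1 mask marking cells within `radius` of any silhouette cell.

    Assumption: a square (Chebyshev-distance) band stands in for
    "the region near the deformed silhouette".
    """
    h, w = len(silhouette), len(silhouette[0])
    mask = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if silhouette[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        mask[yy][xx] = 1.0
    return mask


def blend_in_band(view_a, view_b, mask, weight=0.5):
    """Mix view_b into view_a only where mask == 1.

    Outside the band the result equals view_a; inside it is
    (1 - weight) * view_a + weight * view_b. The weight is a
    hypothetical parameter, not a value taken from the paper.
    """
    return [
        [a + m * weight * (b - a) for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(view_a, view_b, mask)
    ]


# Toy example: a 1x5 grid with a single silhouette cell in the middle.
silhouette = [[0, 0, 1, 0, 0]]
mask = band_mask(silhouette, radius=1)
latent_a = [[1.0, 1.0, 1.0, 1.0, 1.0]]
latent_b = [[3.0, 3.0, 3.0, 3.0, 3.0]]
blended = blend_in_band(latent_a, latent_b, mask)
```

In this toy case only the three cells around the silhouette are mixed, which mirrors the locality described above: cells far from the silhouette keep the values of the first view.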
