3DOT: Texture Transfer for 3DGS Objects from a Single Reference Image
Xiao Cao, Beibei Lin, Bo Wang, Zhiyong Huang, Robby T. Tan
TL;DR
3DOT addresses the challenge of transferring texture from a single 2D image to a fixed 3D Gaussian Splatting object by introducing a progressive generation pipeline, view-consistency gradient guidance, and prompt-tuning-based gradient guidance. The method propagates edits from a reference view to neighboring views, enforces cross-view coherence through gradient-guided diffusion with cross-attention and text cues, and preserves texture characteristics by learning a texture-difference token aligned in CLIP space. Evaluations on 360-degree and forward-facing datasets show state-of-the-art performance in both qualitative and quantitative comparisons, with favorable editing speed compared to NeRF-based approaches. The work demonstrates robust texture identity preservation and view-consistent edits across unseen viewpoints, offering a practical, efficient solution for texture transfer in 3D editing scenarios.
Abstract
3D texture swapping allows for the customization of 3D object textures, enabling efficient and versatile visual transformations in 3D editing. While no dedicated method exists, adapted 2D editing and text-driven 3D editing approaches can serve this purpose. However, 2D editing requires frame-by-frame manipulation, causing inconsistencies across views, while text-driven 3D editing struggles to preserve texture characteristics from reference images. To tackle these challenges, we introduce 3DSwapping, a 3D texture swapping method that integrates: 1) progressive generation, 2) view-consistency gradient guidance, and 3) prompt-tuning-based gradient guidance. To ensure view consistency, our progressive generation process starts by editing a single reference image and gradually propagates the edits to adjacent views. Our view-consistency gradient guidance further reinforces consistency by conditioning the generation model on feature differences between consistent and inconsistent outputs. To preserve texture characteristics, we introduce prompt-tuning-based gradient guidance, which learns a token that captures the difference between the reference image and the 3D object. This token then guides the editing process, ensuring more consistent texture preservation across views. Overall, 3DSwapping integrates these strategies to achieve higher-fidelity texture transfer while preserving structural coherence across multiple viewpoints. Extensive qualitative and quantitative evaluations confirm that these three components enable convincing and effective swapping of 2D reference textures onto 3D objects. Code will be available upon acceptance.
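
As a rough illustration of the progressive generation idea described above (edit the reference view first, then move outward to adjacent views), the following sketch orders camera views by angular distance from the reference view. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `progressive_order`, the use of camera azimuth angles in degrees, and the tie-breaking rule are all hypothetical choices made for this example.

```python
def progressive_order(azimuths_deg, ref_index):
    """Return view indices sorted by angular distance from the reference view.

    Assumption (not from the paper): each view is described by a single
    azimuth angle in degrees, and "adjacent" means closest in azimuth.
    """
    ref = azimuths_deg[ref_index]

    def angular_distance(a):
        # Wrap-around distance on the circle, in [0, 180].
        d = abs(a - ref) % 360.0
        return min(d, 360.0 - d)

    # Ties are broken by view index so the ordering is deterministic.
    return sorted(
        range(len(azimuths_deg)),
        key=lambda i: (angular_distance(azimuths_deg[i]), i),
    )


# Hypothetical camera ring around a 360-degree object.
azimuths = [0.0, 45.0, 90.0, 180.0, 270.0, 315.0]
order = progressive_order(azimuths, ref_index=0)
print(order)  # [0, 1, 5, 2, 4, 3]
```

In a pipeline of this kind, each view in `order` would be edited in turn, with already-edited neighbors available as context for the next one. How the edits are actually conditioned on neighboring views is specified by the paper, not by this sketch.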
