Textured 3D Regenerative Morphing with 3D Diffusion Prior
Songlin Yang, Yushi Lan, Honghua Chen, Xingang Pan
TL;DR
The paper tackles textured 3D morphing across diverse object categories without relying on explicit point-to-point correspondences. It introduces a regenerative morphing pipeline built on a generic 3D diffusion prior (GaussianAnything), interpolating source and target information at three levels (initial noise, LoRA model parameters, and CLIP-conditioned features) and refining the results with Attention Fusion, Token Reordering, and Low-Frequency Enhancement. The authors report better smoothness, plausibility, and cross-category generalization than state-of-the-art baselines, including 3D-aware multi-view approaches, and provide extensive ablations to justify each proposed strategy. The approach enables scalable, texture-preserving 3D morphing suitable for visual effects and creative design, reducing the need for laborious alignment and specialized datasets. Future work points to higher-fidelity 3D priors and to temporal consistency for extending the method to more complex 4D content.
Abstract
Textured 3D morphing creates smooth and plausible interpolation sequences between two 3D objects, covering transitions in both shape and texture. This is important for creative applications such as visual effects in filmmaking. Previous methods rely on establishing point-to-point correspondences and determining smooth deformation trajectories, which inherently restricts them to shape-only morphing on untextured, topologically aligned datasets. This restriction leads to labor-intensive preprocessing and poor generalization. To overcome these challenges, we propose a method for 3D regenerative morphing using a 3D diffusion prior. Unlike previous methods that depend on explicit correspondences and deformations, our method removes the need to obtain correspondences and instead uses the 3D diffusion prior to generate the morphing sequence. Specifically, we introduce a 3D diffusion model and interpolate the source and target information at three levels: initial noise, model parameters, and condition features. We then explore an Attention Fusion strategy to generate smoother morphing sequences. To further improve the plausibility of semantic interpolation and of the generated 3D surfaces, we propose two strategies: (a) Token Reordering, where we match approximate tokens based on semantic analysis to guide implicit correspondences in the denoising process of the diffusion model, and (b) Low-Frequency Enhancement, where we enhance low-frequency signals in the tokens to improve the quality of the generated surfaces. Experimental results show that our method achieves superior smoothness and plausibility in 3D morphing across diverse cross-category object pairs, offering a novel regenerative method for 3D morphing with textured representations.
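The three-level interpolation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which interpolation rule is used at each level, so the sketch assumes spherical interpolation (slerp) for the initial noise and plain linear interpolation for the model parameters and condition features. The names `slerp`, `lerp`, and `interpolate_inputs`, and the dictionary keys `noise`, `lora`, and `cond`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation between two same-shaped arrays (assumed rule for noise)."""
    a_flat, b_flat = a.ravel(), b.ravel()
    cos_omega = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat)
    )
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel (or anti-parallel) inputs: fall back to linear blending.
        return (1.0 - t) * a + t * b
    return (np.sin((1.0 - t) * omega) * a + np.sin(t * omega) * b) / sin_omega


def lerp(a, b, t):
    """Linear interpolation (assumed rule for parameters and condition features)."""
    return (1.0 - t) * a + t * b


def interpolate_inputs(src, tgt, t):
    """Blend source and target inputs at morphing ratio t in [0, 1].

    src and tgt are hypothetical dicts with:
      "noise": initial noise array for the diffusion sampler
      "lora":  dict of per-layer parameter arrays (e.g. fitted weight deltas)
      "cond":  condition feature array (e.g. an image embedding)
    """
    return {
        "noise": slerp(src["noise"], tgt["noise"], t),
        "lora": {k: lerp(src["lora"][k], tgt["lora"][k], t) for k in src["lora"]},
        "cond": lerp(src["cond"], tgt["cond"], t),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def make_inputs():
        return {
            "noise": rng.standard_normal((16, 8)),
            "lora": {"layer0": rng.standard_normal((4, 4))},
            "cond": rng.standard_normal(32),
        }

    source, target = make_inputs(), make_inputs()
    # One blended input set per frame of a 5-frame morphing sequence.
    frames = [interpolate_inputs(source, target, t) for t in np.linspace(0.0, 1.0, 5)]
    print(len(frames), frames[2]["noise"].shape)
```

In a full pipeline, each blended input set would be passed to the diffusion sampler to regenerate one intermediate 3D object. Slerp is a common choice for Gaussian noise because it roughly preserves the norm of the samples, whereas linear blending shrinks it near the midpoint; whether the paper makes the same choice is not stated in the abstract.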
