Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza, Jamie Watson, Mohamed Sayed
TL;DR
Morpheus performs text-driven stylization of 3D scenes that changes geometry as well as appearance. It introduces an autoregressive pipeline that stylizes frames with a dedicated RGBD diffusion model offering independent appearance and depth strength controls, and propagates the stylization across views with a Warp ControlNet and depth-informed feature sharing. A 3D Gaussian Splatting (3DGS) model is then retrained on the stylized frames, giving better multi-view consistency and more striking geometry changes than prior work. Quantitative metrics and a user study show closer adherence to prompts and higher aesthetic quality. Stylized scenes can also serve downstream tasks where training data is limited.
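To make the idea of independent strength controls concrete, the sketch below borrows the convention common in image-to-image diffusion (SDEdit-style), where a strength in [0, 1] sets how many denoising steps are applied to the input. It is only an illustration under that assumption: the names `RGBDStrength`, `denoise_steps`, and `plan_rgbd_steps` are hypothetical, and the summary above does not say how Morpheus implements its controls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RGBDStrength:
    """Hypothetical container for two independent strengths in [0, 1]."""
    appearance: float  # how far the RGB channels may move from the input
    depth: float       # how far the depth channel (shape) may move


def denoise_steps(strength: float, num_steps: int) -> int:
    """Map a strength to a number of denoising steps (SDEdit-style assumption).

    0.0 keeps the input untouched; 1.0 runs the full schedule.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return round(num_steps * strength)


def plan_rgbd_steps(s: RGBDStrength, num_steps: int = 50) -> dict:
    """Return a separate step budget for the RGB and depth channels."""
    return {
        "rgb": denoise_steps(s.appearance, num_steps),
        "depth": denoise_steps(s.depth, num_steps),
    }


# Mild recolouring combined with a strong shape change.
plan = plan_rgbd_steps(RGBDStrength(appearance=0.4, depth=0.9))
```

Decoupling the two values lets a user request a large shape change without also forcing a large appearance change, which is the trade-off the summary describes.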
Abstract
Exploring real-world spaces using novel-view synthesis is fun, and reimagining those worlds in a different style adds another layer of excitement. Stylized worlds can also be used for downstream tasks where there is limited training data and a need to expand a model's training distribution. Most current novel-view synthesis stylization techniques cannot convincingly change geometry. This is because any geometry change requires increased style strength, which is often capped to keep stylization stable and consistent. In this work, we propose a new autoregressive 3D Gaussian Splatting stylization method. As part of this method, we contribute a new RGBD diffusion model that allows separate strength control over appearance and shape stylization. To ensure consistency across stylized frames, we use a combination of novel depth-guided cross-attention, feature injection, and a Warp ControlNet conditioned on composite frames to guide the stylization of new frames. We validate our method via extensive qualitative results, quantitative experiments, and a user study. Code is available online.
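The abstract does not spell out how the composite frames are built. As a rough illustration of the general idea of depth-based propagation, the sketch below forward-warps an already stylized frame into a new camera view with a pinhole model and a z-buffer, then fills uncovered pixels from a fallback image. Everything here is an assumption made for illustration: the function names, the pinhole parameters (`fx`, `fy`, `cx`, `cy`), the pose convention (`R`, `t` map source-camera points into the target camera), and the fallback compositing rule. Pure Python is used for clarity; a real pipeline would vectorize this.

```python
def forward_warp(src_rgb, src_depth, fx, fy, cx, cy, R, t):
    """Forward-warp a source frame into a target view.

    src_rgb:   H x W nested list of colour tuples.
    src_depth: H x W nested list of positive depths (source camera).
    R, t:      3x3 rotation (nested list) and length-3 translation taking
               source-camera points to target-camera coordinates.
    Returns (warped, mask); mask marks target pixels that received a colour.
    """
    h, w = len(src_depth), len(src_depth[0])
    warped = [[(0, 0, 0)] * w for _ in range(h)]
    mask = [[False] * w for _ in range(h)]
    zbuf = [[float("inf")] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            d = src_depth[v][u]
            if d <= 0:
                continue  # no valid depth at this pixel
            # Unproject the pixel to a 3D point in the source camera.
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Move the point into the target camera frame.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 1e-6:
                continue  # behind the target camera
            # Project to the nearest target pixel.
            ut = round(fx * q[0] / q[2] + cx)
            vt = round(fy * q[1] / q[2] + cy)
            # Z-buffer test: keep the closest surface per target pixel.
            if 0 <= ut < w and 0 <= vt < h and q[2] < zbuf[vt][ut]:
                zbuf[vt][ut] = q[2]
                warped[vt][ut] = src_rgb[v][u]
                mask[vt][ut] = True
    return warped, mask


def composite(warped, mask, fallback):
    """Use warped pixels where available, otherwise the fallback image."""
    return [
        [wp if m else fb for wp, m, fb in zip(w_row, m_row, f_row)]
        for w_row, m_row, f_row in zip(warped, mask, fallback)
    ]
```

In this sketch the mask marks where earlier stylization can be reused; the uncovered pixels are the regions a generative model would have to fill in for the new view.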
