DreamColour: Controllable Video Colour Editing without Training
Chaitat Utintu, Pinaki Nath Chowdhury, Aneeshan Sain, Subhadeep Koley, Ayan Kumar Bhunia, Yi-Zhe Song
TL;DR
DreamColour tackles training-free, temporally coherent video colour editing by decoupling spatial colour edits from temporal propagation. It combines a grid-based, instance-aware intra-frame editing stage with bidirectional diffusion priors and spatio-temporal feature injection to propagate edits across frames without retraining. Key contributions include SAM2-guided UniColor masking for precise region control, DDIM inversion with BLIP-2 semantics, and a forward-backward propagation framework that yields smooth, artefact-free colour transitions in complex scenes. The approach delivers professional-quality results on diverse videos using only pre-trained components, making colour editing accessible without specialised hardware. Ablations and comparisons show improved boundary fidelity, temporal consistency, and qualitative appeal relative to zero-shot baselines.
Abstract
Video colour editing is a crucial task for content creation, yet existing solutions either require painstaking frame-by-frame manipulation or produce unrealistic results with temporal artefacts. We present a practical, training-free framework that makes precise video colour editing accessible through an intuitive interface while maintaining professional-quality output. Our key insight is that decoupling the spatial and temporal aspects of colour editing aligns better with users' natural workflow: they can focus on precise colour selection in key frames, and the changes are then propagated across time automatically. We achieve this through a novel technical framework that combines (i) a simple point-and-click interface merging grid-based colour selection with automatic instance segmentation for precise spatial control, (ii) bidirectional colour propagation that leverages inherent video motion patterns, and (iii) motion-aware blending that ensures smooth transitions even with complex object movements. Through extensive evaluation on diverse scenarios, we demonstrate that our approach matches or exceeds state-of-the-art methods while eliminating the need for training or specialised hardware, making professional-quality video colour editing accessible to everyone.
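
To make the idea of bidirectional propagation concrete, the following is a minimal sketch and not the method described in the paper. It assumes that two edited sequences already exist, one propagated forward from the first key frame and one propagated backward from the last, and it blends them with a plain temporal-distance weight. That linear weight is an assumed stand-in for the motion-aware blending mentioned above, whose details are not given here. The names `blend_weights` and `blend_bidirectional` and the toy pixel values are hypothetical.

```python
def blend_weights(num_frames):
    """Weight of the forward pass per frame: 1.0 at the first key frame, 0.0 at the last.

    Assumption: a linear temporal ramp, used here only as a stand-in
    for motion-aware blending.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be positive")
    if num_frames == 1:
        return [1.0]
    return [1.0 - t / (num_frames - 1) for t in range(num_frames)]


def blend_bidirectional(forward, backward):
    """Blend two per-frame colour sequences propagated from opposite ends."""
    if len(forward) != len(backward):
        raise ValueError("sequences must have the same number of frames")
    weights = blend_weights(len(forward))
    blended = []
    for w, f_px, b_px in zip(weights, forward, backward):
        # Per-channel convex combination of the two propagated colours.
        blended.append(tuple(w * f + (1.0 - w) * b for f, b in zip(f_px, b_px)))
    return blended


# Toy example: a single RGB pixel over 5 frames (hypothetical values).
forward = [(200.0, 40.0, 40.0)] * 5   # propagated from the first key frame
backward = [(180.0, 60.0, 40.0)] * 5  # propagated from the last key frame
result = blend_bidirectional(forward, backward)
```

In this sketch each frame trusts the nearer key frame more, so the blended sequence matches the forward result at the first frame and the backward result at the last. A motion-aware scheme would presumably replace the linear ramp with weights that depend on how reliably each direction tracks the moving object.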
