ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization

Bingchen Li, Zhixin Wang, Fan Li, Jiaqi Xu, Jiaming Guo, Renjing Pei, Xin Li, Zhibo Chen

Abstract

Old photos preserve invaluable historical memories, making their restoration and colorization highly desirable. While existing restoration models can address some degradation issues like denoising and scratch removal, they often struggle with accurate colorization. This limitation arises from the unique degradation inherent in old photos, such as faded brightness and shifted color hues, which deviate from the distribution of modern photos and create a substantial domain gap during colorization. In this paper, we propose a novel old photo colorization framework based on the generative diffusion model FLUX. Our approach introduces a structure-color decoupling strategy that separates structure preservation from color restoration, enabling accurate colorization of old photos while maintaining structural consistency. We further enhance the model with a progressive Direct Preference Optimization (Pro-DPO) strategy, which allows the model to learn subtle color preferences through coarse-to-fine transitions in color augmentation. Additionally, we address the limitations of text-based prompts by introducing visual semantic prompts, which extract fine-grained semantic information directly from old photos, helping to eliminate the color bias inherent in old photos. Experimental results on both synthetic and real datasets demonstrate that our approach outperforms existing state-of-the-art colorization methods, including closed-source commercial models, producing high-quality and vivid colorization.

Paper Structure

This paper contains 38 sections, 6 equations, 18 figures, 4 tables.

Figures (18)

  • Figure 1: ColorFLUX achieves vivid and realistic colorization of old photos by decoupling structure and color restoration, exhibiting strong generalization across diverse scenarios, including portraits, group photos, and landscapes.
  • Figure 2: Illustration of the fading effect in real old photos. ColorFLUX brings them back to life (second row).
  • Figure 3: The framework of our proposed method. (a) The inference process of colorization, including a preprocessor that removes degradations in old photos. (b) The training procedure of our method. Notably, we illustrate an example of easy-to-hard losing samples: the model progressively learns preferences from severely augmented samples to milder ones, ensuring precise color perception that effectively counteracts the fading effects of old photos. (c) The preference pair collection pipeline. To synthesize $x^l$, we randomly apply augmentation combinations sampled from ['B', 'C', 'S'].
  • Figure 4: Qualitative comparisons between ColorFLUX and other methods on RealOldPhotos. Zoom in for better views.
  • Figure 5: Visual results of decoupling each part of ColorFLUX. We denote structure consistency training, basic color learning, and fine-color adjustment by SCT, BCL, and FCA, respectively. "W/o" indicates that the corresponding network is removed from the trained framework (e.g., w/o SCT denotes removing the ControlNet).
  • ...and 13 more figures
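
The preference pair collection described in the Figure 3 caption, where a losing sample $x^l$ is synthesized by randomly applied augmentation combinations drawn from ['B', 'C', 'S'], can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes 'B', 'C', and 'S' stand for brightness, contrast, and saturation, and the `severity` parameter and factor ranges are hypothetical stand-ins for the coarse-to-fine augmentation schedule used by Pro-DPO.

```python
import random
import numpy as np

def adjust(img, op, factor):
    """Apply one augmentation to a float RGB array in [0, 1]."""
    if op == 'B':                                   # brightness: scale values
        out = img * factor
    elif op == 'C':                                 # contrast: scale around the mean
        mean = img.mean()
        out = (img - mean) * factor + mean
    else:                                           # 'S' saturation: blend with gray
        gray = img.mean(axis=-1, keepdims=True)
        out = gray + (img - gray) * factor
    return np.clip(out, 0.0, 1.0)

def synthesize_losing_sample(img, severity=0.5, seed=None):
    """Synthesize a losing sample by a random non-empty subset of ['B', 'C', 'S'].

    `severity` in (0, 1] pushes factors further below 1.0, mimicking stronger
    fading; annealing it toward 0 would give the coarse-to-fine transition.
    """
    rng = random.Random(seed)
    ops = rng.sample(['B', 'C', 'S'], k=rng.randint(1, 3))
    out = img
    for op in ops:
        factor = 1.0 - severity * rng.uniform(0.3, 1.0)  # factor < 1 fades the image
        out = adjust(out, op, factor)
    return out, ops
```

Pairing the original image (winner) with such a faded copy (loser) yields the kind of preference data DPO-style training consumes; progressively lowering `severity` would move the pairs from easy to hard, as the caption describes.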