MultiColor: Image Colorization by Learning from Multiple Color Spaces

Xiangcheng Du, Zhao Zhou, Yanlong Wang, Zhuoyao Wang, Yingbin Zheng, Cheng Jin

TL;DR

This work tackles the limitation of single color spaces in image colorization by proposing MultiColor, a framework that learns color channels across multiple color spaces and fuses them with a dedicated Color Space Complementary Network. Each color space is modeled with its own transformer-based color queries and color mapper, enabling space-specific color reasoning, while CSCNet integrates the space-specific predictions into a coherent final color image in an end-to-end fashion. The approach yields state-of-the-art results on ImageNet, COCO-Stuff, and ADE20K, with significant gains in FID and colorfulness metrics, and demonstrates strong generalization without additional fine-tuning. Overall, MultiColor broadens the color representation basis for colorization, improving realism, color diversity, and robustness across datasets.

Abstract

Deep networks have shown impressive performance in image restoration tasks such as image colorization. However, we find that previous approaches rely on the digital representation from a single color model with a specific mapping function, a.k.a. a color space, throughout the colorization pipeline. In this paper, we first investigate the modeling of different color spaces and find that each exhibits distinctive characteristics with a unique distribution of colors. The complementarity among multiple color spaces benefits the image colorization task. We present MultiColor, a new learning-based approach that automatically colorizes grayscale images by combining cues from multiple color spaces. Specifically, we employ a dedicated colorization module for each color space. Within each module, a transformer decoder first refines color query embeddings, and a color mapper then predicts the color channels from the embeddings and semantic features. Given these predicted color channels representing various color spaces, a complementary network exploits their complementarity to generate pleasing and reasonable colorized images. We conduct extensive experiments on real-world datasets, and the results demonstrate superior performance over state-of-the-art methods.
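
To make the idea of a color space as a mapping function concrete, the following minimal sketch maps one sRGB pixel into three spaces. It is not the authors' code. The choice of Lab, HSV, and YUV follows the L, V, and Y channels named in the Figure 1 caption; the function names, the D65 white point, and the BT.601 coefficients are assumptions of this sketch.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_lab(r, g, b):
    """Map sRGB in [0, 1] to CIE Lab (assumed D65 white point)."""
    rl, gl, bl = (srgb_to_linear(v) for v in (r, g, b))
    # Linear RGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def rgb_to_yuv(r, g, b):
    """Map RGB in [0, 1] to YUV (assumed BT.601 luma weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)


def describe(r, g, b):
    """Return the same pixel expressed in Lab, HSV, and YUV."""
    return {
        "Lab": rgb_to_lab(r, g, b),
        "HSV": colorsys.rgb_to_hsv(r, g, b),
        "YUV": rgb_to_yuv(r, g, b),
    }


if __name__ == "__main__":
    gray = 102 / 255  # a neutral gray pixel with value 102
    for name, values in describe(gray, gray, gray).items():
        print(name, tuple(round(v, 3) for v in values))
    # A saturated color spreads very differently across the three spaces.
    for name, values in describe(1.0, 0.0, 0.0).items():
        print(name, tuple(round(v, 3) for v in values))
```

For a neutral gray input, every chroma channel (a, b, S, U, V) is approximately zero and only the luminance-like channels differ in scale. For a colored pixel, the same RGB triple yields differently distributed chroma values in each space, which is the kind of per-space difference the abstract refers to.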

Paper Structure (15 sections, 10 equations, 7 figures, 7 tables)

This paper contains 15 sections, 10 equations, 7 figures, 7 tables.

Figures (7)

  • Figure 1: Color gamut of different color spaces at a specific pixel with grayscale value 102. The corresponding values of the L, V, and Y channels are 42, 0.57, and 0.4 in their respective color spaces. The pentagram marks where this pixel's color value in the ground-truth color image appears in the other color spaces.
  • Figure 2: The architecture of the proposed framework. Given a grayscale image, the encoder extracts multi-scale semantic features. Multiple color space modeling operations produce the color channels of different color spaces. In each colorization module, a transformer decoder refines learnable color queries based on the multi-scale features, and a color mapper generates the color channels. Finally, the Color Space Complementary Network (CSCNet) transforms the multiple color channels into colorized images.
  • Figure 3: The structure of the color mapper.
  • Figure 4: Visual comparison of competing methods on automatic image colorization.
  • Figure 5: Visualization results by learning from different color spaces. The numbers on top of each image indicate CF / $\Delta$CF.
  • ...and 2 more figures