Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation

Huiyu Zhai; Mo Chen; Xingxing Yang; Gusheng Kang

TL;DR

This work addresses NIR-to-RGB translation, where the spectral gap between the two domains makes the mapping ambiguous and puts texture fidelity and color diversity at odds. It introduces the Multi-scale HSV Color Feature Embedding Network (MCFNet), which decomposes the task into three sub-tasks (NIR texture maintenance, RGB color prediction, and coarse geometry reconstruction), handled by the Texture Preserving Block, the HSV Color Feature Embedding Module, and the Geometry Reconstruction Module, respectively, and combined through a multi-scale fusion strategy. The method uses HSV color space guidance and Laplacian-based texture features, integrated through SPADE fusion, and is trained with a loss suite that includes GAN, pair-consistent, cycle-consistent, and edge losses. Experiments on the VCIP dataset show substantial improvements over state-of-the-art NIR colorization methods in PSNR, AE, and LPIPS, and ablations confirm the critical role of texture fusion, multi-scale color guidance, and HSV-CFEM. The approach provides a practical pipeline for NIR-to-RGB spectrum translation with improved texture preservation and color accuracy, and the authors release code for reproducibility.
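To make the phrase "Laplacian-based texture features" concrete, the following is a minimal sketch of a Laplacian filter applied to a grayscale NIR image. It is an illustration only: the 4-neighbour kernel, the edge-replicating border handling, and the name `laplacian_texture` are assumptions made here, not details taken from the paper or its released code.

```python
# 4-neighbour Laplacian kernel (a common choice; assumed here, not taken from the paper).
LAPLACIAN_3X3 = [
    [0.0, 1.0, 0.0],
    [1.0, -4.0, 1.0],
    [0.0, 1.0, 0.0],
]


def laplacian_texture(image):
    """Return the Laplacian response of a 2D grayscale image (list of rows).

    Borders are handled by replicating the nearest edge pixel (an assumption).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    # Clamp indices so out-of-range neighbours reuse the border pixel.
                    rr = min(max(r + dr, 0), height - 1)
                    cc = min(max(c + dc, 0), width - 1)
                    acc += LAPLACIAN_3X3[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


# A flat patch has no texture; a vertical step edge responds only at the step.
flat = [[0.5] * 4 for _ in range(4)]
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
flat_response = laplacian_texture(flat)
edge_response = laplacian_texture(edge)
```

The response is zero in uniform regions and non-zero only where intensity changes, which is why a Laplacian map is a plausible carrier for the fine NIR detail that a texture branch is meant to preserve.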

Abstract

NIR-to-RGB spectral domain translation is a challenging task because of the inherent spectral mapping ambiguities between NIR inputs and RGB outputs. As a result, existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations. In this paper, we propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping process into three sub-tasks: NIR texture maintenance, RGB color prediction, and coarse geometry reconstruction. We design one key module for each sub-task: the Texture Preserving Block (TPB), the HSV Color Feature Embedding Module (HSV-CFEM), and the Geometry Reconstruction Module (GRM), respectively. With these modules, MCFNet tackles spectral translation over a series of increasing resolutions, progressively enriching images with color and texture fidelity in a scale-coherent fashion. The proposed MCFNet demonstrates substantial performance gains on the NIR image colorization task. Code is released at: https://github.com/AlexYangxx/MCFNet.

Paper Structure (14 sections, 5 equations, 6 figures, 2 tables)

INTRODUCTION
RELATED WORK
GAN-based Methods
Transfer Learning-based Methods
PROPOSED METHOD
Texture Preserving Block
HSV Color Feature Embedding Module
Geometry Reconstruction Module
Objectives
EXPERIMENTS AND ANALYSIS
Implementation Details
Comparison Experiments
Ablation Experiments
CONCLUSION

Figures (6)

Figure 1: Mapping relations between NIR images and RGB images. In (a), regions of the sofa with the same intensity in the NIR domain have different colors in the RGB domain. In (b), regions of the sky with different intensities in the NIR domain have the same color in the RGB domain.
Figure 2: The whole framework is trained in a CycleGAN style [Conventional_CycleGAN]. The colorization network $C_A$ is responsible for generating colorized NIR images, while the generator $G_B$ is used to restore colorized images to NIR images.
Figure 3: Our MCFNet consists of three branches: the Texture Preserving Block (TPB), the HSV Color Feature Embedding Module (HSV-CFEM), and the Geometry Reconstruction Module (GRM). TPB extracts the texture map $y_{tex}$ of the NIR input. $G_C$ (HSV-CFEM) generates color feature embeddings $F_{color}$ at multiple scales, together with the color map $y_{hsv}$. Each $F_{color}$ is adaptively injected into the corresponding scale of $G_A$ (GRM) through the SPADE module to serve as color guidance for geometry feature reconstruction. $G_A$ (GRM) reconstructs the geometry map ${y}^{\prime}_{rgb}$ at a coarse level. Finally, the texture map $y_{tex}$, the color map $y_{hsv}$, and the geometry map ${y}^{\prime}_{rgb}$ are fused via the fusion module.
Figure 4: HSV-CFEM is a color space learning approach based on the DCGAN [radford2015unsupervised] architecture. $x_{nir}$ is extended to a three-channel image and converted to the HSV image form $y_{hsv}$ as input to generator $G_C$ (HSV-CFEM), with the goal of generating HSV images close to the true value $x_{hsv}$.
Figure 5: Visual comparison of NIR colorization methods. Images are from the VCIP dataset [yang2023cooperative]. The images generated by our network are closest to the ground truth in terms of color and chromaticity. In terms of texture, for example in the fourth and fifth rows, images generated by other methods may show unnatural transitions at object boundaries because the original texture is lost. Our method effectively reduces this problem. Overall, our results outperform those of other methods in visual appearance.
...and 1 more figure
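The Figure 3 caption says that color embeddings $F_{color}$ are injected into the geometry branch through a SPADE module. As a rough illustration of what SPADE-style modulation does, the sketch below normalizes a feature map and then applies a per-pixel scale and shift predicted from a guidance map. Everything specific here is an assumption: the single-channel maps, the hypothetical 1x1 linear predictors (`w_gamma`, `b_gamma`, `w_beta`, `b_beta`), and the function names are chosen for illustration, whereas real SPADE layers use learned convolutions over multi-channel tensors. It is not the authors' implementation.

```python
import math


def normalize(feature, eps=1e-5):
    """Parameter-free normalization of a single-channel 2D feature map."""
    values = [v for row in feature for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [[(v - mean) / std for v in row] for row in feature]


def spade_modulate(feature, color_guidance, w_gamma, b_gamma, w_beta, b_beta):
    """Scale and shift normalized features with maps derived from the guidance.

    gamma and beta are predicted per pixel by a hypothetical 1x1 linear layer;
    this stands in for the small convolutional stack used in real SPADE layers.
    """
    normed = normalize(feature)
    out = []
    for n_row, g_row in zip(normed, color_guidance):
        out.append([
            (w_gamma * g + b_gamma) * n + (w_beta * g + b_beta)
            for n, g in zip(n_row, g_row)
        ])
    return out


# Toy geometry feature and color guidance at the same spatial scale.
geometry_feature = [[1.0, 2.0], [3.0, 4.0]]
color_feature = [[0.0, 0.0], [1.0, 1.0]]
normed_feature = normalize(geometry_feature)
modulated = spade_modulate(
    geometry_feature, color_feature,
    w_gamma=1.0, b_gamma=1.0, w_beta=0.5, b_beta=0.0,
)
```

Because the scale and shift vary per pixel, the guidance can change the features differently in different image regions. In a multi-scale setup such as the one the caption describes, the same operation would be repeated at each resolution with the guidance map of matching size.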
