High-Resolution Image Translation Model Based on Grayscale Redefinition
Xixian Wu, Dian Chao, Yang Yang
TL;DR
This work addresses cross-domain, high-resolution image translation and introduces a grayscale redefinition strategy to reach pixel-level translation quality. It combines a loss-enhanced Pix2PixHD framework with a grayscale channel pipeline and grayscale density control, and it uses pretraining on SAR2EO to cope with data scarcity across the SAR-to-EO, RGB-to-IR, and SAR-to-IR tasks. Key contributions are Loss_all, which adds LPIPS and L2 terms; a grayscale conversion and density-reconstruction module; and task-specific module combinations that yield competitive PBVS 2024 results, with a final score of $0.32$. The approach shows practical gains in IR synthesis from RGB and SAR inputs and highlights the value of cross-task pretraining and grayscale-based refinement for high-resolution image translation.
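To make the idea of a loss extended with LPIPS and L2 terms concrete, the following is a minimal sketch, not the authors' implementation. It assumes the combined objective is a weighted sum of a base Pix2PixHD loss value, a perceptual term, and a mean-squared-error term. The names `combined_loss`, `perceptual_fn`, `w_perc`, and `w_l2` are hypothetical, the weights are placeholders, and the perceptual term is passed in as a callable standing in for an LPIPS network.

```python
from typing import Callable, Sequence


def l2_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over flattened pixel values."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    pred: Sequence[float],
    target: Sequence[float],
    base_loss: float,
    perceptual_fn: Callable[[Sequence[float], Sequence[float]], float],
    w_perc: float = 1.0,  # hypothetical weight, not taken from the paper
    w_l2: float = 1.0,    # hypothetical weight, not taken from the paper
) -> float:
    """Assumed form: base loss + weighted perceptual term + weighted L2 term."""
    perceptual = perceptual_fn(pred, target)  # stand-in for an LPIPS distance
    return base_loss + w_perc * perceptual + w_l2 * l2_loss(pred, target)
```

In practice `perceptual_fn` would wrap a learned perceptual metric operating on image tensors; a plain callable is used here only to keep the sketch self-contained.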
Abstract
Image-to-image translation transfers images from one domain to another while preserving their essential content. In recent years it has attracted significant attention and advanced rapidly because of its diverse applications in computer vision and image processing. In this work, we propose a method for image translation between different domains. For high-resolution image translation tasks, we use a grayscale adjustment method to achieve pixel-level translation. For the other tasks, we use the Pix2PixHD model with a coarse-to-fine generator, a multi-scale discriminator, and an improved loss to enhance translation performance. In addition, to address the scarcity of training data, we initialize the model with weights trained on another task to improve performance on the current task.
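The weight-initialization idea can be illustrated with a short sketch, given here under stated assumptions rather than as the authors' procedure. It assumes parameters are stored by name and that a pretrained value is reused only when the name exists in both models and the sizes match; everything else keeps its fresh initialization. The function name `init_from_pretrained` and the flat-list parameter representation are hypothetical simplifications.

```python
from typing import Dict, List, Tuple


def init_from_pretrained(
    target: Dict[str, List[float]],
    pretrained: Dict[str, List[float]],
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Return new parameters for the target model and the names that were copied.

    A parameter is taken from `pretrained` only if the same name exists there
    and the flattened size matches; otherwise the target's own value is kept.
    """
    merged: Dict[str, List[float]] = {}
    copied: List[str] = []
    for name, value in target.items():
        source = pretrained.get(name)
        if source is not None and len(source) == len(value):
            merged[name] = list(source)  # reuse weights from the other task
            copied.append(name)
        else:
            merged[name] = list(value)   # keep the fresh initialization
    return merged, copied
```

For example, a model for the current task could reuse encoder weights learned on another translation task while a layer whose size differs stays freshly initialized; the list of copied names makes it easy to check which layers were transferred.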
