Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization

Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu, Yufeng Jane Tseng

TL;DR

Ultra-high-resolution (UHR) unpaired image-to-image (I2I) translation is constrained by GPU memory, which forces patch-wise inference; normalizing each patch with its own statistics then produces tiling artifacts. The paper introduces Dense Normalization (DN), a plug-in layer that estimates pixel-level statistical moments using a fast interpolation algorithm and a single-pass, prefetching-based parallelism strategy, so it can be added to existing models without retraining. DN reduces tiling artifacts and preserves local color and hue contrast, and the authors report state-of-the-art performance on natural and pathological datasets, including stain transformation tasks in medical imaging. The contributions are the fast interpolation algorithm, a caching-based single-pass pipeline, extensive evaluations, and the release of the real2paint dataset to support future research in UHR I2I translation.

Abstract

Recent advancements in ultra-high-resolution unpaired image-to-image translation have aimed to mitigate the constraints imposed by limited GPU memory through patch-wise inference. Nonetheless, existing methods often compromise between the reduction of noticeable tiling artifacts and the preservation of color and hue contrast, attributed to the reliance on global image- or patch-level statistics in the instance normalization layers. In this study, we introduce a Dense Normalization (DN) layer designed to estimate pixel-level statistical moments. This approach effectively diminishes tiling artifacts while concurrently preserving local color and hue contrasts. To address the computational demands of pixel-level estimation, we further propose an efficient interpolation algorithm. Moreover, we invent a parallelism strategy that enables the DN layer to operate in a single pass. Through extensive experiments, we demonstrate that our method surpasses all existing approaches in performance. Notably, our DN layer is hyperparameter-free and can be seamlessly integrated into most unpaired image-to-image translation frameworks without necessitating retraining. Overall, our work paves the way for future exploration in handling images of arbitrary resolutions within the realm of unpaired image-to-image translation. Code is available at: https://github.com/Kaminyou/Dense-Normalization.
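
The link the abstract draws between patch-level statistics and tiling artifacts can be illustrated with a toy example. The sketch below is not the authors' code: it assumes a one-row, single-channel "image" and a hypothetical helper `instance_norm`, and only compares the intensity step across a patch boundary under image-level versus patch-level statistics.

```python
from statistics import fmean, pstdev


def instance_norm(values, eps=1e-5):
    """Normalize a list of floats with its own mean and population std."""
    mu, sigma = fmean(values), pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


# Toy one-row "image": a smooth ramp of eight pixel intensities.
row = [float(v) for v in range(8)]

# Image-level statistics: a single mean/std for the whole row.
global_out = instance_norm(row)

# Patch-level statistics: the two 4-pixel patches are normalized separately.
patch_out = instance_norm(row[:4]) + instance_norm(row[4:])

# Step between the two pixels that straddle the patch boundary (index 3 -> 4).
global_step = global_out[4] - global_out[3]
patch_step = patch_out[4] - patch_out[3]
print(f"image-level step: {global_step:+.3f}, patch-level step: {patch_step:+.3f}")
```

With image-level statistics the ramp keeps rising across the boundary, whereas with patch-level statistics both patches are mapped to the same normalized values and the signal drops at the seam. In this toy setting, that drop is the kind of discontinuity behind gap-type tiling.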

Paper Structure

This paper contains 15 sections, 11 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Comparison of translations. (a) Showcases a real2paint translated ultra-high-resolution image (3,024$\times$4,032 pixels) produced by our Dense Normalization (DN) from the image displayed in the top right corner, with comparisons highlighted within the blue-boxed region. (b) Illustrates the occurrence of gap-type tiling artifacts in patch-wise IN [ulyanov2016instance] or KIN [ho2022ultra]. (c) Demonstrates jitter-type tiling artifacts resulting from TIN [chen2022towards]. (d) Presents DN's effectiveness in diminishing tiling artifacts.
  • Figure 2: Comparison of various normalization strategies. This figure illustrates the framework and the impact of different normalization methods on a UHR image (3,024$\times$4,032 pixels) for the summer2autumn task: (a) Patch-wise IN [ulyanov2016instance] uses patch-level statistics and leads to statistical differences between patches, resulting in noticeable gap-type tiling artifacts. (b) TIN [chen2022towards] eliminates statistical differences with global image-level statistics (from the thumbnail) but compromises color and hue details, also inducing jitter-type tiling artifacts. (c) KIN [ho2022ultra] utilizes a two-stage pipeline to mitigate statistical differences by applying convolutional operations on patch-level statistics, albeit at the expense of local detail. (d) DN estimates pixel-level statistical moments in a single pass, effectively preserving local color and hue while diminishing tiling artifacts. (e) DN outperforms all methods in every aspect of human evaluation. In the row of features, ✓ indicates "achieved"; $\bigtriangleup$ indicates "partially achieved"; ✗ indicates "not achieved". Red close-up boxes highlight the outcomes influenced by different statistical moments used for normalization.
  • Figure 3: Framework of the Proposed Method. (a) Provides an overall view of our framework's pipeline. (b) Shows the details of the dispatcher and the Dense Normalization (DN) layer. A UHR image $\boldsymbol{X}$ is initially divided into patches $\boldsymbol{x}^{\text{patch}}_{r, c}$, with $r$ and $c$ representing the row and column coordinates, respectively. The dispatcher sequences two patches for the prefetching and inference branches. Within the DN layer, the prefetching branch calculates and caches statistical moments. For the inference branch, statistics for the patch and its eight surrounding patches are queried. Subsequently, fast interpolations are employed to estimate the mean ($\hat{\boldsymbol{\mu}}^{\text{pixel}}_{r,c}$) and standard deviation ($\hat{\boldsymbol{\sigma^*}}^{\text{pixel}}_{r,c}$) for each pixel, facilitating dense normalization.
  • Figure 4: Details of the fast interpolation operation utilized in DN. Panel (a) illustrates the process of deriving N$\times$N pixel-level statistical moment estimations from a 3$\times$3 matrix. Panel (b) visualizes the matrix multiplication operation involved in fast interpolation.
  • Figure 5: Comparison of two-stage and single-pass DN. A naïve implementation of DN might resemble KIN, operating in two stages. However, our dispatcher design and prefetching strategy enable the prefetching branch to run in parallel with the inference branch across most neural network (NN) layers, and to execute asynchronously in the DN layer, effectively hiding the runtime of the prefetching branch.
  • ...and 2 more figures
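
The Figure 4 caption describes deriving N$\times$N pixel-level moments from a 3$\times$3 matrix of patch statistics through matrix multiplication. A minimal sketch of one way this can work is separable bilinear interpolation, written as a product of a weight matrix, the 3$\times$3 statistics, and the transposed weight matrix. The weights below are an assumption (statistics are placed at patch centers and interpolated linearly), and the helper names `interp_weights`, `matmul`, `transpose`, and `dense_moments` are hypothetical; the paper's exact formulation may differ.

```python
def interp_weights(n):
    """Return an n x 3 matrix of linear-interpolation weights.

    Row i holds the weights that pixel i of the central patch assigns to the
    (previous, central, next) patch statistics along one axis. Statistics are
    assumed to sit at patch centers (an assumption of this sketch).
    """
    weights = []
    for i in range(n):
        # Offset of the pixel center from the central patch center, in patch units.
        u = (i + 0.5) / n - 0.5
        if u < 0:
            weights.append([-u, 1.0 + u, 0.0])
        else:
            weights.append([0.0, 1.0 - u, u])
    return weights


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dense_moments(stats_3x3, n):
    """Expand 3x3 patch-level statistics into n x n pixel-level estimates."""
    w = interp_weights(n)
    # (n x 3) @ (3 x 3) @ (3 x n) -> n x n
    return matmul(matmul(w, stats_3x3), transpose(w))


# Patch means of the central patch and its eight neighbors (toy values that
# increase from left to right).
patch_means = [[0.0, 1.0, 2.0],
               [0.0, 1.0, 2.0],
               [0.0, 1.0, 2.0]]
pixel_means = dense_moments(patch_means, n=4)
print(pixel_means[0])  # [0.625, 0.875, 1.125, 1.375]
```

Each row of the weight matrix sums to one, so a constant 3$\times$3 input yields a constant N$\times$N output, and the estimates vary smoothly toward the neighboring patches' statistics rather than jumping at the patch border. The same two multiplications apply to the standard deviation.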