Multiscale Sliced Wasserstein Distances as Perceptual Color Difference Measures
Jiaqi He, Zhihua Wang, Leon Wang, Tsein-I Liu, Yuming Fang, Qilin Sun, Kede Ma
TL;DR
The paper addresses perceptual color difference (CD) assessment under image misalignment, where traditional CD measures that compare co-located pixels fail to predict human judgments. It introduces MS-SWD, a training-free CD measure that compares non-local patch distributions across multiple scales: it builds Gaussian pyramids in the CIELAB color space, computes the sliced Wasserstein distance (SWD) at each scale, and averages the results over $K$ scales. On the SPCD dataset, MS-SWD outperforms competing methods on imperfectly aligned image pairs, and it is empirically shown to behave as a metric in the mathematical sense; it is further validated as a loss function for image and video color transfer. The method is computationally efficient because it relies on random linear projections and establishes correspondences by sorting. The paper also discusses extensions to alternative pyramids and other perceptual tasks. The code is publicly available, enabling adoption in research and applications that require robust perceptual CD assessment.
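The pipeline summarized above (pyramid, random projections, sorting, averaging over scales) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`blur_downsample`, `patch_matrix`, `sliced_wasserstein`, `ms_swd`), the 5-tap binomial blur, the 3x3 patch size, the number of projections, and the use of the mean absolute difference between sorted projections are all assumptions made for illustration. The inputs are assumed to be equally sized floating-point arrays of shape (H, W, 3) that have already been converted to CIELAB.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def blur_downsample(img):
    """One pyramid step: 5-tap binomial blur, then keep every 2nd pixel."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    padded = np.pad(img, ((2, 2), (2, 2), (0, 0)), mode="reflect")
    # Separable filtering: along rows first, then along columns.
    rows = sum(k[i] * padded[i:i + img.shape[0], :, :] for i in range(5))
    out = sum(k[j] * rows[:, j:j + img.shape[1], :] for j in range(5))
    return out[::2, ::2, :]


def patch_matrix(img, size):
    """Flatten every size x size patch (all channels) into one row."""
    win = sliding_window_view(img, (size, size), axis=(0, 1))
    # win has shape (H - size + 1, W - size + 1, C, size, size).
    return win.reshape(-1, img.shape[2] * size * size)


def sliced_wasserstein(x, y, directions):
    """Average 1-D Wasserstein-1 distance over the projection directions.

    Sorting each projected sample set pairs up patches by rank, which is
    the optimal 1-D transport plan; no spatial alignment is needed.
    """
    px = np.sort(x @ directions.T, axis=0)
    py = np.sort(y @ directions.T, axis=0)
    return float(np.mean(np.abs(px - py)))


def ms_swd(img_a, img_b, num_scales=3, patch=3, num_proj=32, seed=0):
    """Average the sliced Wasserstein distance over num_scales pyramid levels."""
    rng = np.random.default_rng(seed)
    dim = img_a.shape[2] * patch * patch
    total = 0.0
    for _ in range(num_scales):
        # Random unit-norm projection directions (an assumed choice).
        d = rng.standard_normal((num_proj, dim))
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        total += sliced_wasserstein(
            patch_matrix(img_a, patch), patch_matrix(img_b, patch), d
        )
        img_a, img_b = blur_downsample(img_a), blur_downsample(img_b)
    return total / num_scales


# Usage: a spatially shifted copy versus a color-shifted copy of random data.
rng = np.random.default_rng(1)
ref = rng.random((32, 32, 3))
print("shifted in space:", ms_swd(ref, np.roll(ref, 1, axis=1)))
print("shifted in color:", ms_swd(ref, ref + 0.2))
```

Because the comparison is between sorted projections of patch sets rather than between co-located pixels, a small spatial shift changes the score far less than a global color offset, which is the behavior the summary attributes to distribution-based comparison.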
Abstract
Contemporary color difference (CD) measures for photographic images typically operate by comparing co-located pixels, patches in a "perceptually uniform" color space, or features in a learned latent space. Consequently, these measures inadequately capture the human color perception of misaligned image pairs, which are prevalent in digital photography (e.g., the same scene captured by different smartphones). In this paper, we describe a perceptual CD measure based on the multiscale sliced Wasserstein distance, which facilitates efficient comparisons between non-local patches of similar color and structure. This aligns with the modern understanding of color perception, where color and structure are inextricably interdependent as a unitary process of perceptual organization. Meanwhile, our method is easy to implement and training-free. Experimental results indicate that our CD measure performs favorably in assessing CDs in photographic images, and consistently surpasses competing models in the presence of image misalignment. Additionally, we empirically verify that our measure functions as a metric in the mathematical sense, and show its promise as a loss function for image and video color transfer tasks. The code is available at https://github.com/real-hjq/MS-SWD.
