Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data

Georgy Perevozchikov, Nancy Mehta, Egor Ershov, Radu Timofte

TL;DR

This work tackles the reliance on costly paired raw-to-sRGB data by introducing Experts-Guided Unbalanced Optimal Transport (EGUOT) for ISP learning. EGUOT couples Unbalanced OT with a Committee of Expert Discriminators to guide a transport-based raw-to-sRGB mapping, enabling effective training in both unpaired and paired settings and offering robustness to dataset outliers. Across three diverse datasets, EGUOT achieves state-of-the-art unpaired performance and consistently improves or matches paired baselines, demonstrating architecture-agnostic adaptability. Ablation studies confirm the necessity of the UOT objective and the expert committee, highlighting improved color fidelity, texture, and artifact suppression, with practical implications for scalable ISP development.

Abstract

Learned Image Signal Processing (ISP) pipelines offer powerful end-to-end performance but are critically dependent on large-scale paired raw-to-sRGB datasets. This reliance on costly-to-acquire paired data remains a significant bottleneck. To address this challenge, we introduce a novel, unsupervised training framework based on Optimal Transport capable of training arbitrary ISP architectures in both unpaired and paired modes. We are the first to successfully apply Unbalanced Optimal Transport (UOT) for this complex, cross-domain translation task. Our UOT-based framework provides robustness to outliers in the target sRGB data, allowing it to discount atypical samples that would be prohibitively costly to map. A key component of our framework is a novel "committee of expert discriminators," a hybrid adversarial regularizer. This committee guides the optimal transport mapping by providing specialized, targeted gradients to correct specific ISP failure modes, including color fidelity, structural artifacts, and frequency-domain realism. To demonstrate the superiority of our approach, we retrained existing state-of-the-art ISP architectures using our paired and unpaired setups. Our experiments show that while our framework, when trained in paired mode, exceeds the performance of the original paired methods across all metrics, our unpaired mode concurrently achieves quantitative and qualitative performance that rivals, and in some cases surpasses, the original paired-trained counterparts. The code and pre-trained models are available at: https://github.com/gosha20777/EGUOT-ISP.git.
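
The outlier-discounting behaviour of UOT mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the neural solver used in this work; it swaps in the classical entropic Sinkhorn-style scaling iteration on four 1-D target points, one of which is a distant outlier. The function name `sinkhorn_plan`, the point locations, and the values of `eps` and `rho` are assumptions chosen only for illustration. With balanced marginals the outlier must receive its full share of mass; with KL-relaxed marginals the plan sends almost nothing to it.

```python
import math

def sinkhorn_plan(a, b, cost, eps, rho=None, n_iter=500):
    """Entropic OT via Sinkhorn-style scaling (illustrative toy solver).

    rho=None -> balanced OT (marginals enforced exactly).
    rho>0    -> unbalanced OT with KL marginal penalties of strength rho.
    """
    # Exponent 1 recovers balanced Sinkhorn; rho/(rho+eps) relaxes the marginals.
    power = 1.0 if rho is None else rho / (rho + eps)
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        u = [(a[i] / sum(K[i][j] * v[j] for j in range(m))) ** power
             for i in range(n)]
        v = [(b[j] / sum(K[i][j] * u[i] for i in range(n))) ** power
             for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def column_masses(plan):
    """Mass received by each target point under the transport plan."""
    return [sum(row[j] for row in plan) for j in range(len(plan[0]))]

# Hypothetical 1-D data: three sources, three nearby targets, one far outlier.
sources = [0.0, 0.1, 0.2]
targets = [0.0, 0.1, 0.2, 5.0]          # last target is the outlier
a = [1.0 / 3.0] * 3
b = [0.25] * 4
cost = [[(x - y) ** 2 for y in targets] for x in sources]

balanced = column_masses(sinkhorn_plan(a, b, cost, eps=0.1))
unbalanced = column_masses(sinkhorn_plan(a, b, cost, eps=0.1, rho=1.0))

balanced_outlier_mass = balanced[-1]
unbalanced_outlier_mass = unbalanced[-1]
unbalanced_inlier_masses = unbalanced[:-1]

print("balanced   outlier mass:", balanced_outlier_mass)
print("unbalanced outlier mass:", unbalanced_outlier_mass)
```

In this toy setting the relaxation strength `rho` controls the trade-off: larger values approach the balanced problem, while smaller values let the plan drop expensive targets more freely.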

Paper Structure

This paper contains 25 sections, 6 equations, 11 figures, 9 tables, 1 algorithm.

Figures (11)

  • Figure 1: A conceptual overview of our framework's robustness to outliers. Top: Standard GANs (RawFormer perevozchikov2024rawformer), when trained on unpaired datasets, are highly sensitive to data outliers (e.g., overexposed photos). This leads to training instability and a collapsed output. Bottom: Our proposed EGUOT framework, leveraging Experts-Guided Unbalanced Optimal Transport, is inherently robust to such data outliers and focuses on the high-quality core of the target distribution to produce perceptually better results.
  • Figure 2: An overview of our EGUOT training framework. (a) The main pipeline, where the generator ($T_\theta$) is trained with three parallel objectives: (1) a content-preserving cost function ($c(\cdot,\cdot)$) for both paired and unpaired modes, (2) a robust Potential network ($P_\omega$), and (3) an Experts committee ($D_\psi$). (b) The detailed breakdown of the Experts committee, showing its three specialized discriminators for Color, Structure, and Frequency, each analyzing different features of the real ($Y$) and predicted ($\hat{Y}$) sRGB images.
  • Figure 3: Our framework's unpaired mode matches paired performance across diverse architectures on the Zurich raw-to-sRGB dataset ignatov2020replacing. The top row (a) shows the results of SOTA ISP backbones (e.g., Restormer perevozchikov2024rawformer, cmKAN nikonorov2025color) trained with their original paired settings. The bottom row (b) shows the results of the same backbones trained in unpaired mode, highlighting the effectiveness of our method. Best viewed in the electronic version.
  • Figure 4: Quantitative results of unpaired frameworks on the Zurich raw-to-sRGB dataset ignatov2020replacing. We compare against modern GAN-based (e.g., RawFormer perevozchikov2024rawformer, UVCGANv2 torbunov2023rethinking), other OT-based (e.g., NOT korotin2022neural, UOTM choi2023generative), and recent unpaired ISP methods (LLUISP arhire2025learned). Our framework sets a new state-of-the-art performance across all metrics. Best viewed in the electronic version.
  • Figure 5: Quantitative results on the ISPIW raw-to-sRGB dataset shekhar2022transform, demonstrating the architecture-agnostic nature of our EGUOT framework. We apply (a) the original paired training and (b) our proposed unpaired training to a wide range of SOTA ISP backbones. Our unpaired mode achieves performance that is highly competitive with the original settings. Best viewed in the electronic version.
  • ...and 6 more figures