BaryIR: Learning Multi-Source Unified Representation in Continuous Barycenter Space for Generalizable All-in-One Image Restoration

Xiaole Tang, Xiaoyi He, Xiang Gu, Jian Sun

TL;DR

BaryIR tackles the all-in-one image restoration problem by learning a continuous, multi-source barycenter space that captures degradation-agnostic features, while simultaneously maintaining source-specific subspaces for degradation semantics. It formulates a multi-source latent OT barycenter objective (MLOT) with a neural-network parameterized barycenter map that transports source representations to the barycenter, aided by source-level contrastiveness and barycenter-anchored orthogonality. Through a maximin training regime on dual OT potentials and a conjoined decoder, BaryIR achieves strong generalization to real-world and unseen degradations, outperforming several state-of-the-art AIR methods. The approach provides theoretical error bounds for the learned barycenter map and demonstrates improved robustness across synthetic benchmarks and real-world datasets. This framework paves the way for continuous, geometry-aware unified representations in low-level vision and potentially multi-modal contexts.

Abstract

Despite remarkable advances in all-in-one image restoration (AIR) for handling different types of degradations simultaneously, existing methods remain vulnerable to out-of-distribution degradations and images, limiting their real-world applicability. In this paper, we propose BaryIR, a multi-source representation learning framework that decomposes the latent space of multi-source degraded images into a continuous barycenter space for unified feature encoding and source-specific subspaces for specific semantic encoding. Specifically, we seek the multi-source unified representation by introducing a multi-source latent optimal transport barycenter problem, in which a continuous barycenter map is learned to transport the latent representations to the barycenter space. The transport cost is designed such that the representations from source-specific subspaces are contrasted with each other while maintaining orthogonality to those from the barycenter space. This enables BaryIR to learn compact representations with unified degradation-agnostic information from the barycenter space, as well as degradation-specific semantics from source-specific subspaces, capturing the inherent geometry of the multi-source data manifold for generalizable AIR. Extensive experiments demonstrate that BaryIR achieves competitive performance compared to state-of-the-art all-in-one methods. In particular, BaryIR exhibits superior generalization ability to real-world data and unseen degradations. The code will be publicly available at https://github.com/xl-tang3/BaryIR.
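
For orientation, the classical optimal transport barycenter problem that the abstract builds on can be written as follows. This is the standard textbook formulation, not the paper's MLOT objective: the symbols $\mathbb{P}_k$ (the $K$ source distributions), $\lambda_k$ (weights), $c_k$ (transport costs) and $T_k$ (transport maps) are notation assumed here for illustration, and the contrastive and orthogonality terms that the abstract describes in the transport cost are not reproduced.

```latex
% Classical OT barycenter of K source distributions (standard form; notation assumed)
\min_{\mathbb{Q}} \; \sum_{k=1}^{K} \lambda_k \, \mathrm{OT}_{c_k}(\mathbb{P}_k, \mathbb{Q}),
\qquad \lambda_k \ge 0, \quad \sum_{k=1}^{K} \lambda_k = 1,

% with the Monge form of the transport cost between P_k and Q
\mathrm{OT}_{c_k}(\mathbb{P}_k, \mathbb{Q})
  = \inf_{T_k \,:\, T_k \# \mathbb{P}_k = \mathbb{Q}}
    \int c_k\bigl(x, T_k(x)\bigr) \, \mathrm{d}\mathbb{P}_k(x).
```

In this reading, a learned barycenter map plays the role of $T_k$, pushing each source's latent distribution onto the shared barycenter $\mathbb{Q}$.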

Paper Structure

This paper contains 15 sections, 2 theorems, 17 equations, 5 figures, 5 tables.

Key Result

Theorem 4.1

The minimum objective value $\mathcal{L}^*$ of the MLOT barycenter problem (main:mlot-bary) can be expressed as

Figures (5)

  • Figure 1: BaryIR decomposes the latent space of multi-source degraded images into a continuous barycenter space and source-specific subspaces. The source-specific representations are contrasted with each other while remaining orthogonal to the barycenter ones. The barycenter space seeks to encode degradation-agnostic features by aggregating the multiple source domains, which enriches the overall geometry of the data manifold.
  • Figure 2: Overview of the proposed BaryIR framework. Based on the MLOT barycenter objective, we train the MLOT barycenter map that transports the latent representation to the barycenter space. Correspondingly, we can establish the source-specific subspaces, whose elements are the differences between the source representations and their barycenter counterparts. By aggregating representations from both spaces, BaryIR can capture degradation-agnostic and degradation-specific semantics for all-in-one image restoration. The encoder and decoder adopt the Restormer [Zamir2021Restormer] architecture.
  • Figure 3: Visual examples of generalization evaluation with five-degradation models on the unseen real-world datasets O-HAZE [ancuti2018haze] and SPANet [Wang_2019_CVPR].
  • Figure 4: Visual examples on unseen real-world mixed-degradation images. Row 1: haze and rain. Row 2: blur and noise.
  • Figure 5: The t-SNE visualization of different representations.
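
The decomposition described in the Figure 1 and Figure 2 captions can be illustrated with a small NumPy sketch. It assumes that a source-specific representation is the residual between a latent code and its barycenter image, and it measures orthogonality with a squared cosine similarity. The function names, the penalty form, and the toy projection used as a stand-in for the learned barycenter map are all hypothetical and are not taken from the BaryIR code.

```python
import numpy as np

def decompose(z, barycenter_map):
    """Split latents z of shape (n, d) into a barycenter part and a residual.

    Assumption: the source-specific part is z minus its barycenter image.
    """
    b = barycenter_map(z)   # barycenter (degradation-agnostic) representation
    r = z - b               # source-specific residual
    return b, r

def orthogonality_penalty(b, r, eps=1e-8):
    """Mean squared cosine similarity between barycenter and residual features.

    Hypothetical penalty: it is zero when every residual is orthogonal
    to its barycenter counterpart.
    """
    num = np.sum(b * r, axis=1)
    den = np.linalg.norm(b, axis=1) * np.linalg.norm(r, axis=1) + eps
    return float(np.mean((num / den) ** 2))

def toy_map(z):
    """Stand-in for a learned map: keep the first two coordinates, zero the rest."""
    out = np.zeros_like(z)
    out[:, :2] = z[:, :2]
    return out

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 5))          # four toy latent codes of dimension five
b, r = decompose(z, toy_map)
penalty = orthogonality_penalty(b, r)
```

Because the toy map is an orthogonal projection, the residuals are exactly orthogonal to the barycenter parts and the penalty vanishes. A learned map would generally only approach this under training.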

Theorems & Definitions (2)

  • Theorem 4.1: Dual reformulation for the MLOT barycenter problem (main:mlot-bary)
  • Theorem 4.2: Error analysis via duality gaps for the recovered maps