Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport

Milena Gazdieva, Jaemoo Choi, Alexander Kolesov, Jaewoong Choi, Petr Mokrov, Alexander Korotin

TL;DR

This work addresses robust aggregation of multiple source distributions by proposing a continuous semi-unbalanced OT (SUOT) barycenter framework. It derives a scalable max-min reformulation of the semi-dual problem, enforces an $m$-congruence condition on the potentials, and parameterizes the solution with neural networks to handle high-dimensional data. The method is robust to outliers and class imbalance, comes with quality bounds on the recovered plans, and is validated on synthetic and image data, including a shape-color experiment in the latent space of a StyleGAN model. Because it operates in the continuous, sample-based setting and provides rejection-based inference, it scales to robust data fusion across heterogeneous sources. The accompanying code supports reproducibility.

Abstract

Aggregating data from multiple sources can be formalized as an Optimal Transport (OT) barycenter problem, which seeks to compute the average of probability distributions with respect to OT discrepancies. However, in real-world scenarios, the presence of outliers and noise in the data measures can significantly hinder the performance of traditional statistical methods for estimating OT barycenters. To address this issue, we propose a novel scalable approach for estimating the robust continuous barycenter, leveraging the dual formulation of the (semi-)unbalanced OT problem. To the best of our knowledge, this paper is the first attempt to develop an algorithm for robust barycenters under the continuous distribution setup. Our method is framed as a min-max optimization problem and is adaptable to general cost functions. We rigorously establish the theoretical underpinnings of the proposed method and demonstrate its robustness to outliers and class imbalance through a number of illustrative experiments. Our source code is publicly available at https://github.com/milenagazdieva/U-NOTBarycenters.
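
To make the outlier sensitivity mentioned in the abstract concrete, the following toy sketch computes a classical (balanced) barycenter in 1D. It is not the paper's SUOT method or its U-NOTB solver. It relies on a standard fact: for the quadratic cost in one dimension, the barycenter's quantile function is the weighted average of the input quantile functions. The function name `w2_barycenter_1d`, the Gaussian inputs, and the outlier value 50 are all illustrative assumptions.

```python
import random

def w2_barycenter_1d(samples, weights):
    """Balanced W2 barycenter of equal-size 1D empirical samples.

    In 1D with quadratic cost, the barycenter's quantile function is the
    weighted average of the input quantile functions, so for equal-size
    samples it is the weighted average of the sorted values.
    """
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    if any(len(s) != n for s in sorted_samples):
        raise ValueError("all samples must have the same size")
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples))
        for i in range(n)
    ]

# Hypothetical inputs: two Gaussians, then the second one with 5% outliers.
rng = random.Random(0)
n = 1000
p1 = [rng.gauss(-2.0, 0.5) for _ in range(n)]
p2 = [rng.gauss(2.0, 0.5) for _ in range(n)]
p2_noisy = [50.0] * 50 + p2[50:]  # replace 50 points with a far-away value

clean = w2_barycenter_1d([p1, p2], [0.5, 0.5])
noisy = w2_barycenter_1d([p1, p2_noisy], [0.5, 0.5])

print(f"largest barycenter point, clean inputs: {max(clean):.2f}")
print(f"largest barycenter point, 5% outliers:  {max(noisy):.2f}")
```

Because the balanced problem must transport all of the input mass, the 5% of outliers drags the upper tail of the barycenter far away from the clean result. This is the failure mode that relaxing the marginal constraints is meant to avoid.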

Paper Structure

This paper contains 27 sections, 6 theorems, 57 equations, 7 figures, 8 tables, 1 algorithm.

Key Result

Theorem 1

The dual form of the SUOT barycenter problem (\ref{eq:primal-suot-bary}) is given by

Figures (7)

  • Figure 1: The semi-unbalanced barycenter of distributions of colors ($\mathbb{P}_1$) and digits ($\mathbb{P}_2$) computed by our U-NOTB solver in the latent space of a StyleGAN model pretrained on colored MNIST images of the digits '2' and '3'. Our solver successfully eliminates outliers in the input distributions.
  • Figure 2: Transport map $T_1(x_1),\;x_1\sim\mathbb{P}_1$ and UOT barycenter distribution $\mathbb{Q} = {T_1}_\# \mathbb{P}_1$ obtained by Discrete UOT, UOTM, Mini-batch UOT, and our method in the Spiral$\rightarrow$Gaussian Mixture experiment.
  • Figure 3: Conditional plans $\gamma_k(y|x)$ and barycenter $\mathbb{Q}$ obtained by NOTB and our method in the Gaussian Mixture barycenter experiment. We evaluate several values of the unbalancedness parameter, $\tau \in \{ 1,20,200 \}$.
  • Figure 4: Learned barycenter $\mathbb{Q}$ obtained from $x\sim \mathbb{P}_1$ by NOTB and our method on the Gaussian Mixture with 5% outliers. The evaluation is conducted for several values of the unbalancedness parameter, $\tau \in \{ 1,20,200 \}$. For comparison, we additionally trained NOTB without outliers, as shown in subfigure (c).
  • Figure 5: Examples of $x_1 \sim \mathbb{P}_1$ (grayscale digits) and $x_2 \sim \mathbb{P}_2$ (color images) and their corresponding barycenter samples $y\sim \mathbb{Q}$ (colored digits) in the shape-color experiment.
  • ...and 2 more figures

Theorems & Definitions (11)

  • Theorem 1: Semi-dual form of SUOT barycenter problem
  • Corollary 1: Maximin reformulation for the semi-dual problem (\ref{eq:dual-suot-bary-before})
  • Corollary 2: Congruence Condition of the Special SUOT barycenter problem
  • Theorem 2: SUOT barycenter conditional plans are contained in optimal saddle points
  • Theorem 3: Quality bounds for recovered plans
  • Proof of Theorem \ref{thm-dual-suot-bary}
  • Proof of Corollary \ref{corr-dual-suot-bary}
  • Proof of Corollary \ref{corr-congruence}
  • Theorem 4: Connection between solutions of dual OT and UOT problems [choi2024generative]
  • Proof of Theorem \ref{thm-bary-uotot-connection}
  • ...and 1 more