Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
Milena Gazdieva, Jaemoo Choi, Alexander Kolesov, Jaewoong Choi, Petr Mokrov, Alexander Korotin
TL;DR
This work addresses robust aggregation of multiple source distributions by proposing a continuous semi-unbalanced OT (SUOT) barycenter framework. It derives a scalable max–min dual formulation, enforces an $m$-congruence condition on the potentials, and parameterizes the solution with neural networks to handle high-dimensional data. The method is robust to outliers and class imbalance, comes with theoretical guarantees on the recovered plans, and is validated on synthetic and image-like tasks, including a ShapeGAN-style experiment. Because it operates in the continuous, data-driven setting and supports rejection-based inference, it offers a flexible and scalable tool for robust data fusion across heterogeneous sources. The accompanying code is publicly available for reproducibility.
Abstract
Aggregating data from multiple sources can be formalized as an Optimal Transport (OT) barycenter problem, which seeks to compute the average of probability distributions with respect to OT discrepancies. However, in real-world scenarios, the presence of outliers and noise in the data measures can significantly hinder the performance of traditional statistical methods for estimating OT barycenters. To address this issue, we propose a novel scalable approach for estimating the robust continuous barycenter, leveraging the dual formulation of the (semi-)unbalanced OT problem. To the best of our knowledge, this paper is the first attempt to develop an algorithm for robust barycenters under the continuous distribution setup. Our method is framed as a min-max optimization problem and is adaptable to general cost functions. We rigorously establish the theoretical underpinnings of the proposed method and demonstrate its robustness to outliers and class imbalance through a number of illustrative experiments. Our source code is publicly available at https://github.com/milenagazdieva/U-NOTBarycenters.
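For orientation, the classical (balanced) OT barycenter problem mentioned in the abstract is commonly written as below. This is a sketch in standard notation, not the paper's exact formulation: the symbols $\mathbb{P}_k$ (source distributions), $\lambda_k$ (nonnegative weights summing to one), $c_k$ (cost functions), and $\Pi(\mathbb{P}_k, \mathbb{Q})$ (couplings with the given marginals) are assumptions introduced here for illustration.

$$
\mathbb{Q}^{*} \in \arg\min_{\mathbb{Q}} \sum_{k=1}^{K} \lambda_k \, \mathrm{OT}_{c_k}(\mathbb{P}_k, \mathbb{Q}),
\qquad
\mathrm{OT}_{c_k}(\mathbb{P}_k, \mathbb{Q}) = \inf_{\pi \in \Pi(\mathbb{P}_k, \mathbb{Q})} \int c_k(x, y) \, d\pi(x, y).
$$

The robust variant described in the abstract replaces each $\mathrm{OT}_{c_k}$ term with a (semi-)unbalanced OT discrepancy, so that outliers in the sources $\mathbb{P}_k$ have less influence on the resulting barycenter.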
