Decentralized Entropic Optimal Transport for Distributed Distribution Comparison

Xiangfeng Wang, Hongteng Xu, Moyi Yang

TL;DR

This work tackles distributed distribution comparison, where data are partitioned across agents and cannot be shared. It introduces Decentralized Entropic Optimal Transport (DEOT), which works in the dual form: locally stored dual variables are updated by a mini-batch randomized block-coordinate descent (MRBCD) scheme, and the kernel matrix $\bm{K}$ is estimated by a privacy-preserving decentralized kernel approximation. The paper bounds the overall approximation error by decomposing it into protocol-mismatch, kernel-approximation, and algorithmic (convergence) components, and it extends the framework to the entropic Gromov-Wasserstein (EGW) distance with a bilinear alignment $\bm{P}$. Experiments on synthetic data and real-world distributed domain adaptation show robust performance under various communication protocols and privacy constraints, and they illustrate the trade-off between privacy and precision in privacy-sensitive distributed learning tasks.

Abstract

Distributed distribution comparison aims to measure the distance between the distributions whose data are scattered across different agents in a distributed system and cannot even be shared directly among the agents. In this study, we propose a novel decentralized entropic optimal transport (DEOT) method, which provides a communication-efficient and privacy-preserving solution to this problem with theoretical guarantees. In particular, we design a mini-batch randomized block-coordinate descent (MRBCD) scheme to optimize the DEOT distance in its dual form. The dual variables are scattered across different agents and updated locally and iteratively with limited communications among partial agents. The kernel matrix involved in the gradients of the dual variables is estimated by a decentralized kernel approximation method, in which each agent only needs to approximate and store a sub-kernel matrix by one-shot communication and without sharing raw data. Besides computing entropic Wasserstein distance, we show that the proposed MRBCD scheme and kernel approximation method also apply to entropic Gromov-Wasserstein distance. We analyze our method's communication complexity and, under mild assumptions, provide a theoretical bound for the approximation error caused by the convergence error, the estimated kernel, and the mismatch between the storage and communication protocols. In addition, we discuss the trade-off between the precision of the EOT distance and the strength of privacy protection when implementing our method. Experiments on synthetic data and real-world distributed domain adaptation tasks demonstrate the effectiveness of our method.
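To make the dual-form idea in the abstract concrete, here is a minimal, centralized sketch of randomized block-coordinate ascent on the entropic OT dual for two small discrete distributions. It is not the authors' MRBCD scheme: there are no agents, no communication protocol, and no kernel approximation, and the function names (`random_block_dual_ascent`, `transport_plan`, `dual_objective`), the KL-to-product-measure form of the regularizer, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def dual_objective(a, b, C, f, g, eps):
    """Assumed dual: <f,a> + <g,b> - eps * sum_ij a_i b_j (exp((f_i+g_j-C_ij)/eps) - 1)."""
    linear = sum(fi * ai for fi, ai in zip(f, a)) + sum(gj * bj for gj, bj in zip(g, b))
    penalty = sum(
        a[i] * b[j] * (math.exp((f[i] + g[j] - C[i][j]) / eps) - 1.0)
        for i in range(len(a))
        for j in range(len(b))
    )
    return linear - eps * penalty


def random_block_dual_ascent(a, b, C, eps=1.0, batch=2, iters=500, seed=0):
    """Each step picks a random mini-batch of source or target dual variables
    and maximizes the dual exactly over that block (weights must be positive)."""
    rng = random.Random(seed)
    n, m = len(a), len(b)
    f, g = [0.0] * n, [0.0] * m
    for _ in range(iters):
        if rng.random() < 0.5:
            # Update a random block of source-side dual variables f_i.
            for i in rng.sample(range(n), min(batch, n)):
                f[i] = -eps * logsumexp(
                    [math.log(b[j]) + (g[j] - C[i][j]) / eps for j in range(m)]
                )
        else:
            # Update a random block of target-side dual variables g_j.
            for j in rng.sample(range(m), min(batch, m)):
                g[j] = -eps * logsumexp(
                    [math.log(a[i]) + (f[i] - C[i][j]) / eps for i in range(n)]
                )
    return f, g


def transport_plan(a, b, C, f, g, eps):
    """Primal plan recovered from the dual variables: P_ij = a_i b_j exp((f_i+g_j-C_ij)/eps)."""
    return [
        [a[i] * b[j] * math.exp((f[i] + g[j] - C[i][j]) / eps) for j in range(len(b))]
        for i in range(len(a))
    ]


# Toy usage: 3 source points and 2 target points on the real line, squared-distance cost.
a, b = [0.2, 0.5, 0.3], [0.4, 0.6]
xs, ys = [0.0, 1.0, 2.0], [0.5, 1.5]
C = [[(x - y) ** 2 for y in ys] for x in xs]
f, g = random_block_dual_ascent(a, b, C, eps=1.0)
P = transport_plan(a, b, C, f, g, eps=1.0)
```

Because the dual is separable across the entries of `f` once `g` is fixed (and vice versa), updating a whole mini-batch at once is still an exact block maximization. In the decentralized setting described in the abstract, each such block would live on one agent and the kernel entries would be approximated rather than computed from raw data.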

Paper Structure

This paper contains 23 sections, 3 theorems, 33 equations, 6 figures, 2 tables, 4 algorithms.

Key Result

Lemma 1

Let $\mu=\sum_i p_i\mu_i$ and $\gamma=\sum_j q_j \gamma_j$ be the two distributions in a distributed system with $I$ source agents and $J$ target agents, whose storage and communication protocols are $\bm{pq}^{\top}=[p_iq_j]$ and $\bm{E}=[e_{ij}]$, respectively. If $\max_{i,j}W_{\varepsilon}(\mu_i,\

Figures (6)

  • Figure 1: An illustration of the distributed distribution comparison task and the corresponding decentralized entropic optimal transport problem. Specifically, the data in the two domains follow the distributions $\mu$ and $\gamma$, respectively, but are scattered among different agents. Each agent holds only a subset of the samples, and its local distribution is denoted as $\mu_i$ (or $\gamma_j$). Communication between agents is allowed, but it must obey the communication protocol determined by the network topology. Additionally, in the privacy-preserving setting, the communication of raw data is forbidden.
  • Figure 2: An illustration of the proposed DEOT method.
  • Figure 3: In each subfigure, the black dotted line indicates the EOT distance computed by the classic Sinkhorn-scaling algorithm. The red, orange, and blue curves indicate the average convergence curves of our DEOT method when applying raw data or binary data. (a-c) show the results of computing $W_{\varepsilon}(\mathcal{N}_1,\mathcal{N}_2)$. (d-f) show the results of computing $W_{\varepsilon}(\mathcal{M}_1,\mathcal{M}_2)$ in the i.i.d. setting. (g-i) show the results of computing $W_{\varepsilon}(\mathcal{M}_1,\mathcal{M}_2)$ in the non-i.i.d. setting.
  • Figure 4: The RMAE of EOT distance and the RMSE of sample estimation with respect to $\frac{Q}{D}$.
  • Figure 5: In each subfigure, the black dotted line indicates the EOT distance computed by the classic Sinkhorn-scaling algorithm. The red, orange, and blue curves indicate the average convergence curves of our DEOT method under different communication protocols.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Lemma 1: Irreducible Estimation Error Caused by Mismatched Protocols.
  • proof
  • Lemma 2: Approximation Error of Kernel [khanduri2021decentralized]
  • proof
  • Lemma 3