Dual-Bounded Nonlinear Optimal Transport for Size Constrained Min Cut Clustering

Fangyuan Xie, Jinghui Yuan, Feiping Nie, Xuelong Li

TL;DR

This work targets size-constrained Min Cut clustering by recasting it as a dual-bounded nonlinear optimal transport problem and solving it with a new Dual-bounded Nonlinear Frank-Wolfe (DNF) method. The authors introduce the DB-NOT formulation, proving convergence guarantees for convex and Lipschitz-smooth non-convex objectives, and apply it to the Min Cut objective with gradient $-2SF$. The main contributions include the DB-NOT framework, the DNF algorithm with norm-based and inner product-based gradient approximation, and a corollary giving a convergence bound for the size-constrained Min Cut problem. Empirical results on eight real-world datasets show state-of-the-art performance in clustering quality and speed, with a parameter-free, stable optimization process, suggesting broad applicability to nonlinear bounded OT problems in graph clustering and beyond.

Abstract

Min cut is an important graph partitioning method. However, current approaches to the min cut problem are slow, hard to optimize, and often converge to simple solutions. To address these issues, we relax the min cut problem with a dual-bounded constraint and, for the first time, treat it as a dual-bounded nonlinear optimal transport problem. We also develop a method for solving dual-bounded nonlinear optimal transport based on the Frank-Wolfe method (abbreviated as DNF). Notably, DNF not only solves the size constrained min cut problem but also applies to all dual-bounded nonlinear optimal transport problems. We prove that for convex problems satisfying Lipschitz smoothness, DNF achieves a convergence rate of \(\mathcal{O}(\frac{1}{t})\). Applied to the min cut problem, DNF achieves state-of-the-art performance in both loss function value and clustering accuracy at the fastest speed, with a convergence rate of \(\mathcal{O}(\frac{1}{\sqrt{t}})\). Moreover, DNF for the size constrained min cut problem requires no parameters and exhibits better stability.
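
To make the setting concrete, the following is a minimal sketch of a classical Frank-Wolfe loop on a dual-bounded feasible set. It is not the authors' DNF algorithm: the objective $-\mathrm{Tr}(F^\top S F)$, the feasible set (non-negative $F$, rows summing to 1, column sums between `b_l` and `b_u`), the generic LP solver used for the linear subproblem, the step size $2/(t+2)$, and all function names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def lmo_dual_bounded(G, b_l, b_u):
    """Linear subproblem: min_D <G, D> with D >= 0, rows summing to 1,
    and every column sum in [b_l, b_u]. Solved with a generic LP solver."""
    n, c = G.shape
    row_sum = np.kron(np.eye(n), np.ones((1, c)))  # (n, n*c): sums each row of D
    col_sum = np.kron(np.ones((1, n)), np.eye(c))  # (c, n*c): sums each column of D
    res = linprog(
        G.ravel(),                                  # row-major flattening of D
        A_ub=np.vstack([col_sum, -col_sum]),        # col sums <= b_u and >= b_l
        b_ub=np.concatenate([np.full(c, float(b_u)), -np.full(c, float(b_l))]),
        A_eq=row_sum,
        b_eq=np.ones(n),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, c)


def frank_wolfe_min_cut(S, c, b_l, b_u, n_iter=20):
    """Classical Frank-Wolfe on f(F) = -Tr(F^T S F) (assumed objective)."""
    n = S.shape[0]
    # Uniform start; feasible when b_l <= n / c <= b_u.
    F = np.full((n, c), 1.0 / c)
    for t in range(n_iter):
        G = -2.0 * S @ F                    # gradient for symmetric S
        D = lmo_dual_bounded(G, b_l, b_u)   # feasible point minimizing <G, D>
        gamma = 2.0 / (t + 2.0)             # standard diminishing step size
        F = (1.0 - gamma) * F + gamma * D   # convex combination stays feasible
    return F


# Hypothetical toy graph: two groups of three nodes, strong within-group similarity.
S = np.full((6, 6), 0.1)
S[:3, :3] = 1.0
S[3:, 3:] = 1.0
F = frank_wolfe_min_cut(S, c=2, b_l=2, b_u=4)
print(F.sum(axis=0))  # column sums, i.e. soft cluster sizes
```

Because every iterate is a convex combination of feasible points, the row-sum and cluster-size bounds hold at each step without any projection, which is the main reason a Frank-Wolfe scheme suits this kind of constraint set.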

Paper Structure

This paper contains 37 sections, 18 theorems, 111 equations, 10 figures, 1 table, 4 algorithms.

Key Result

Theorem 4.2

The size constrained MC problem is a $2\|S\|_F$-smooth dual-bounded nonlinear optimal transport problem, i.e., $\max_{F \in \Omega} J_{\text{MC}} \in P_{DB}^{2\|S\|_F}$. Proof in 4.2.
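
The smoothness constant in Theorem 4.2 can be checked numerically. The sketch below assumes the objective has the form $\mathrm{Tr}(F^\top S F)$ with a symmetric $S$, which is consistent with the gradient $-2SF$ quoted in the TL;DR for the equivalent minimization but is not stated in this excerpt. Under that assumption, $\|{-2SF_1} + 2SF_2\|_F \le 2\|S\|_2\|F_1 - F_2\|_F \le 2\|S\|_F\|F_1 - F_2\|_F$, so the gradient is $2\|S\|_F$-Lipschitz. The matrix sizes and random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 8, 3

# Hypothetical symmetric similarity matrix.
A = rng.random((n, n))
S = (A + A.T) / 2.0


def grad(F):
    """Gradient of f(F) = -Tr(F^T S F) for symmetric S (assumed objective)."""
    return -2.0 * S @ F


# Claimed Lipschitz constant of the gradient.
L = 2.0 * np.linalg.norm(S, "fro")

# Largest observed ratio ||grad(F1) - grad(F2)||_F / ||F1 - F2||_F over random pairs.
max_ratio = 0.0
for _ in range(200):
    F1 = rng.random((n, c))
    F2 = rng.random((n, c))
    ratio = np.linalg.norm(grad(F1) - grad(F2)) / np.linalg.norm(F1 - F2)
    max_ratio = max(max_ratio, ratio)

print(max_ratio <= L)
```

The Frobenius norm is generally a loose bound here; the tight constant for this assumed objective is $2\|S\|_2$, which never exceeds $2\|S\|_F$.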

Figures (10)

  • Figure 1: Comparison of inner product and norm-based measure in gradient approximation.
  • Figure 2: The clustering distribution with lower and upper bounds. (a) PalmData25. (b) USPS20. (c) Waveform21. (d) MnistData05.
  • Figure 3: Change of distribution of element values in indicator matrix during the iteration process for MnistData05 dataset.
  • Figure 4: Variation of objective function values with the number of iterations. (a) PalmData25. (b) MnistData05.
  • Figure 5: The clustering distribution with lower and upper bounds. (a) COIL20. (b) Digit. (c) JAFFE. (d) MSRA25.
  • ...and 5 more figures

Theorems & Definitions (33)

  • Definition 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Theorem 4.5
  • Theorem 4.6
  • Definition 4.7
  • Definition 4.8
  • Theorem 4.9
  • Theorem 4.10
  • ...and 23 more