Robust Graph-Based Semi-Supervised Learning via $p$-Conductances

Sawyer Jack Robertson, Chester Holtz, Zhengchao Wan, Gal Mishne, Alexander Cloninger

TL;DR

This paper proposes $p$-conductance learning, which generalizes the $p$-Laplace and Poisson learning methods through an objective reminiscent of $p$-Laplacian regularization and an affine relaxation of the label constraints. The result is a family of probability measure mincut programs that balance sparse edge removal with accurate distribution separation.

Abstract

We study the problem of semi-supervised learning on graphs in the regime where data labels are scarce or possibly corrupted. We propose an approach called $p$-conductance learning that generalizes the $p$-Laplace and Poisson learning methods by introducing an objective reminiscent of $p$-Laplacian regularization and an affine relaxation of the label constraints. This leads to a family of probability measure mincut programs that balance sparse edge removal with accurate distribution separation. Our theoretical analysis connects these programs to well-known variational and probabilistic problems on graphs (including randomized cuts, effective resistance, and Wasserstein distance) and provides motivation for robustness when labels are diffused via the heat kernel. Computationally, we develop a semismooth Newton-conjugate gradient algorithm and extend it to incorporate class-size estimates when converting the continuous solutions into label assignments. Empirical results on computer vision and citation datasets demonstrate that our approach achieves state-of-the-art accuracy in low label-rate, corrupted-label, and partial-label regimes.
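
For orientation, the sketch below illustrates Laplace learning, the classical hard-constraint baseline that the abstract says $p$-conductance learning generalizes (the $p=2$ case of $p$-Laplace learning). It is not the $p$-conductance method itself. The function name `laplace_learning`, the dictionary-based graph format, and the Gauss-Seidel iteration are illustrative assumptions, not taken from the paper.

```python
def laplace_learning(weights, labels, n_iter=2000, tol=1e-10):
    """Harmonic (Laplace learning) extension of labels on a weighted graph.

    weights: dict mapping node -> {neighbor: positive edge weight} (symmetric).
    labels:  dict mapping labeled node -> real value (e.g. 1.0 or 0.0).
    Returns a dict node -> value; labeled values are held fixed.
    """
    u = {v: labels.get(v, 0.0) for v in weights}
    for _ in range(n_iter):
        max_change = 0.0
        for v, nbrs in weights.items():
            if v in labels:
                continue  # hard label constraint: labeled nodes never move
            # Gauss-Seidel update: weighted average of the neighbors' values.
            new = sum(w * u[j] for j, w in nbrs.items()) / sum(nbrs.values())
            max_change = max(max_change, abs(new - u[v]))
            u[v] = new
        if max_change < tol:
            break
    return u


# Path graph 0 - 1 - 2 - 3 with unit weights; the endpoints carry the two labels.
path = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
u = laplace_learning(path, {0: 1.0, 3: 0.0})
```

On this path graph the harmonic solution interpolates linearly between the two labeled endpoints. With very few labels on a large graph, such hard-constrained solutions are known to become nearly constant away from the labeled nodes, which is the low label-rate setting the abstract targets.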

Paper Structure

This paper contains 24 sections, 11 theorems, 91 equations, 7 figures, 6 tables, 3 algorithms.

Key Result

Theorem 2.1

Assume $G$ is connected and let $\mu,\nu\in\mathsf{P}(V)$. The program $\mathcal{C}_1(\mu,\nu)$ is realizable as $\mathtt{mincut}(\mu,\nu)$, which is a linear program in the variables $k=(k_{ij})\in \mathbb{R}^{2N}$ and $\psi\in\mathbb{R}^n$. Moreover, $\mathtt{mincut}(\mu,\nu)$ admits a dual formulation in the variables $J=(J_{ij})\in \mathbb{R}^{E'}$ and $f\in\mathbb{R}$. Here, $\widetilde{B}$ ...

Figures (7)

  • Figure 1: (Left) One-vs-all measure mincut on the Sklearn Digits dataset. The initially labeled nodes are shown with images overlaid. $\mu$ is given by the five images of the digit six, and $\nu$ by those of all the other classes. Solving the measure mincut program for $p=1$, we obtain a sparse solution $\phi$ such that each $\{i, j\}\in E$ satisfies $|\phi_i-\phi_j| \approx 0$ (light gray) or $|\phi_i-\phi_j| \approx 0.11$ (red). (Right) Removing red edges isolates a connected component with near-perfect class separation.
  • Figure 2: Visualizing solutions $\phi$ to ${\mathcal{C}_p}$ for $p \in [1,\infty]$. Measures $\mu, \nu$ appear in blue and red, with opacity proportional to value. Edge width is proportional to $|\phi_i - \phi_j|$. "X" symbols mark edges crossing $\{\phi \le T\}$ and $\{\phi > T\}$, where $T$ is the mean of $\phi$.
  • Figure 3: Accuracy on CIFAR-10 for various diffusion times and robustness. Shades denote noise level. Label rate is 10 labels per class.
  • Figure 4: Convergence of KKT residuals on lattice data for $p=5$
  • Figure 5: MNIST experiments
  • ...and 2 more figures
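
The post-processing described in the captions of Figures 1 and 2 (thresholding $\phi$ at its mean $T$, marking the edges that cross between $\{\phi \le T\}$ and $\{\phi > T\}$, and removing them to isolate a component) can be sketched roughly as follows. The potential `phi` below is a hypothetical toy input rather than a solution of $\mathcal{C}_p$, and both helper names are illustrative assumptions.

```python
def cut_edges_at_mean(phi, edges):
    """Return (T, crossing): T is the mean of phi, and crossing lists the edges
    with one endpoint in {phi <= T} and the other in {phi > T}."""
    T = sum(phi.values()) / len(phi)
    crossing = [(i, j) for i, j in edges if (phi[i] <= T) != (phi[j] <= T)]
    return T, crossing


def component_after_removal(nodes, edges, removed, start):
    """Nodes reachable from `start` once the `removed` edges are deleted."""
    removed = {frozenset(e) for e in removed}
    adj = {v: [] for v in nodes}
    for i, j in edges:
        if frozenset((i, j)) not in removed:
            adj[i].append(j)
            adj[j].append(i)
    seen, stack = {start}, [start]
    while stack:  # depth-first search over the remaining edges
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


# Hypothetical potential on a 5-node path: constant on each side of edge (1, 2).
phi = {0: 0.11, 1: 0.11, 2: 0.0, 3: 0.0, 4: 0.0}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
T, crossing = cut_edges_at_mean(phi, edges)
side = component_after_removal(phi, edges, crossing, start=0)
```

In this toy input only the edge `(1, 2)` has a nonzero difference $|\phi_i-\phi_j|$, so it is the single crossing edge, and removing it separates nodes `0` and `1` from the rest.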

Theorems & Definitions (28)

  • Theorem 2.1: Generalized Max Flow - Min Cut Theorem
  • Theorem 2.2: $\mathcal{C}_1$ via randomized cuts
  • Remark 2.3: $\mathcal{C}_2$ via normalized cuts
  • Theorem 2.4
  • Corollary 2.5
  • Corollary 2.6
  • Theorem 2.7
  • Remark 2.8
  • Remark 3.1
  • Remark 3.2
  • ...and 18 more