Locally Regularized Sparse Graph by Fast Proximal Gradient Descent

Dongfang Sun, Yingzhen Yang

TL;DR

SRSG addresses clustering by incorporating local geometric structure into the sparse graphs used for spectral clustering. It introduces a support-distance-based regularizer $R_{\mathbf{S}}({\mathbf{Z}})$ and optimizes the resulting nonconvex objective with Fast Proximal Gradient Descent with Support Projection (FPGD-SP), whose $O(1/k^2)$ convergence rate matches Nesterov's optimal rate for first-order methods on smooth convex objectives. The sparse codes are initialized from a vanilla sparse graph, and SRSG is solved by coordinate descent with FPGD-SP at each step. Across diverse datasets, SRSG consistently outperforms standard sparse-graph methods, demonstrating improved robustness to noise and better capture of manifold structure.

Abstract

Sparse graphs built by sparse representation have been demonstrated to be effective in clustering high-dimensional data. Despite its compelling empirical performance, the vanilla sparse graph ignores the geometric information of the data by performing sparse representation for each datum separately. In order to obtain a sparse graph aligned with the local geometric structure of data, we propose a novel Support Regularized Sparse Graph, abbreviated as SRSG, for data clustering. SRSG encourages local smoothness on the neighborhoods of nearby data points by a well-defined support regularization term. We propose a fast proximal gradient descent method to solve the non-convex optimization problem of SRSG, with a convergence rate matching Nesterov's optimal convergence rate of first-order methods on smooth and convex objective functions with Lipschitz continuous gradient. Extensive experimental results on various real data sets demonstrate the superiority of SRSG over other competing clustering methods.
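
For orientation, the Nesterov-type rate mentioned above is the one attained by standard accelerated proximal gradient descent on a convex problem. The sketch below is a generic FISTA-style solver for an $\ell^{1}$-regularized least-squares sparse code; it is not the paper's FPGD-SP, which additionally uses a support projection and targets a non-convex objective. The function names, the objective scaling, and the parameter `lam` are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(X, y, lam, n_iter=200):
    """Minimize ||y - X z||_2^2 + lam * ||z||_1 by accelerated proximal gradient.

    Illustrative only: a convex stand-in, not the FPGD-SP algorithm.
    """
    # Lipschitz constant of the gradient of the smooth term ||y - X z||_2^2.
    L = 2.0 * np.linalg.norm(X, 2) ** 2
    s = 1.0 / L                      # step size
    z = np.zeros(X.shape[1])         # current iterate
    w = z.copy()                     # extrapolated point
    t = 1.0                          # momentum parameter
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y)
        z_next = soft_threshold(w - s * grad, s * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Nesterov extrapolation using the two most recent iterates.
        w = z_next + ((t - 1.0) / t_next) * (z_next - z)
        z, t = z_next, t_next
    return z
```

With an identity dictionary the minimizer is the soft-thresholded signal, so `fista_lasso(np.eye(3), np.array([3.0, -0.2, 1.0]), 1.0)` returns approximately `[2.5, 0.0, 0.5]`.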

Paper Structure

This paper contains 15 sections, 2 theorems, 34 equations, 3 figures, 3 tables, 2 algorithms.

Key Result

Theorem 3.2

Let $\{{\mathbf{z}}^{(k)}\}$ be the sequence generated by the FPGD-SP algorithm, and suppose that there exists a constant $G$ such that ${\left\|\nabla f(\bm^{(k)})\right\|}_{2} \le G$ for all $k \ge 1$. Suppose $s < \min\left\{ \frac{2\tau}{G^2},\frac{1}{L_f} \right\}$ with $L_f \coloneqq 2 \sigma\ldots$ … where $\mathbf{U}^{(k_0)} \coloneqq k_0(k_0-1)\bigl( {\tilde{F}}(\mathbf{z}^{(k_0-1)}) -{\tilde{F}}\ldots$

Figures (3)

  • Figure 1: During the construction of the support regularized sparse graph, point $\mathbf{x}_i$ is among the $K$ nearest neighbors of $\mathbf{x}_t$ and $\mathbf{x}_j$. ${\mathbf{Z}}^t$ and ${\mathbf{Z}}^j$ have the same support, denoted by the three black dots ($\mathbf{x}_{k_1}$, $\mathbf{x}_{k_2}$ and $\mathbf{x}_{k_3}$), suggesting the correct neighbors of $\mathbf{x}_i$. By penalizing the support distance between nearby points, $\mathbf{x}_i$ is encouraged to choose the three black dots as neighbors in the sparse graph while discarding the wrong neighbors marked in red.
  • Figure 2: Parameter sensitivity on the UMIST Face Data, from left to right: Accuracy with respect to different values of $\gamma$; NMI with respect to different values of $\gamma$; Accuracy with respect to different values of $K$; NMI with respect to different values of $K$.
  • Figure 3: Comparison between the weighted adjacency matrix $W$ of the sparse graph produced by $\ell^{1}$-graph (right) and SRSG (left) on the Extended Yale Face Database B, where each white dot indicates an edge in the sparse graph.
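
The idea in Figure 1 of penalizing the support distance between nearby points can be sketched as follows. The exact definition of $R_{\mathbf{S}}({\mathbf{Z}})$ is not given in this summary, so the sketch assumes that the support distance counts the coordinates where exactly one of two sparse codes is nonzero, that ${\mathbf{Z}}^i$ is the $i$-th column of ${\mathbf{Z}}$, and that neighborhoods are Euclidean $K$-nearest neighbors. All function names are hypothetical.

```python
import numpy as np

def support_distance(z_a, z_b, tol=1e-10):
    """Number of coordinates where exactly one code is nonzero (assumed definition)."""
    supp_a = np.abs(z_a) > tol
    supp_b = np.abs(z_b) > tol
    return int(np.sum(supp_a != supp_b))

def knn_indices(X, K):
    """Indices of the K nearest neighbors (Euclidean) of each column of X, excluding itself."""
    sq = np.sum(X * X, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :K]

def support_regularizer(Z, neighbors):
    """Sum of support distances between each code Z[:, i] and its neighbors' codes."""
    total = 0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            total += support_distance(Z[:, i], Z[:, j])
    return total

# Three 1-D points: the first two are close together, the third is far away.
X = np.array([[0.0, 0.1, 5.0]])
# Columns are sparse codes: the first two share a support, the third does not.
Z = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [3.0, 3.0, 0.0]])
penalty = support_regularizer(Z, knn_indices(X, K=1))  # 3
```

Only the third point contributes here: its code disagrees with its nearest neighbor's support in all three coordinates, while the two nearby points share a support and incur no penalty.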

Theorems & Definitions (5)

  • Definition 3.1
  • Theorem 3.2
  • Proposition A.1
  • proof
  • proof : Proof of Theorem 3.2