Towards Learnable Anchor for Deep Multi-View Clustering

Bocheng Wang, Chusheng Zeng, Mulin Chen, Xuelong Li

TL;DR

DMAC tackles the quadratic-time bottleneck in deep multi-view clustering by making anchors learnable through a perturbation-driven mechanism. It combines a per-view anchor graph convolution with a mutual-information-based cross-view consistency objective to produce clustering-oriented anchors and a discriminative fusion embedding; the final clustering is obtained by running $k$-means on that embedding, at a cost that scales linearly with the number of samples. The key contributions are (i) a differentiable Perturbation Generation Network that refines anchors during training, (ii) an Anchor Graph Convolution Network that derives per-view anchor clustering distributions, (iii) cross-view alignment through mutual information, and (iv) a structure-preserving, regularized objective with proven linear time complexity $O(n)$ under typical settings. Experiments on six real-world datasets show that DMAC achieves state-of-the-art clustering accuracy (ACC) and NMI while substantially reducing runtime, because it replaces quadratic-cost sample-to-sample graph construction with anchor graphs.

Abstract

Deep multi-view clustering that incorporates graph learning has shown great potential, but most methods incur a time cost that is quadratic in the data size. In theory, anchor-based graph learning can alleviate this limitation. However, existing deep models mainly rely on manual discretization approaches to select anchors, which means that 1) the anchors are fixed during model training and 2) they may deviate from the true cluster distribution. Consequently, unreliable anchors may corrupt the clustering results. In this paper, we propose the Deep Multi-view Anchor Clustering (DMAC) model, which performs clustering in linear time. Concretely, the initial anchors are perturbed with positive-incentive noise sampled from a Gaussian distribution, so that they can be optimized with a newly designed anchor learning loss that promotes a clear relationship between samples and anchors. An anchor graph convolution is then devised to model the cluster structure formed by the anchors, and a mutual information maximization loss is built to provide cross-view clustering guidance. In this way, the learned anchors better represent the clusters. With the optimized anchors, the full sample graph is calculated to derive a discriminative embedding for clustering. Extensive experiments on several datasets demonstrate the superior performance and efficiency of DMAC compared with state-of-the-art competitors.
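
To make the linear-time argument concrete, the following is a minimal NumPy sketch of a generic anchor graph, not the authors' implementation. It assumes a softmax over negative squared distances as the sample-to-anchor affinity and a fixed-scale Gaussian perturbation of the anchors (in DMAC the perturbation is learned). The function name `anchor_graph`, the `temperature` parameter, and all sizes are hypothetical. The point it illustrates is that with $m \ll n$ anchors the graph is $n \times m$, and propagation through the implicit sample graph $\mathbf{A}\mathbf{A}^{\mathrm{T}}$ never needs an $n \times n$ matrix.

```python
import numpy as np

def anchor_graph(Z, U, temperature=1.0):
    """Row-stochastic (n, m) affinity between n samples and m anchors.

    Assumption: softmax over negative squared Euclidean distances.
    The paper's exact construction may differ.
    """
    # Pairwise squared distances, shape (n, m); cost is O(n * m * d).
    d2 = ((Z[:, None, :] - U[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    A = np.exp(logits)
    return A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, m, d = 1000, 10, 16  # hypothetical: samples, anchors, embedding dimension

Z = rng.normal(size=(n, d))                    # stand-in for a view embedding
U0 = Z[rng.choice(n, size=m, replace=False)]   # initial anchors drawn from the samples
U = U0 + 0.1 * rng.normal(size=U0.shape)       # Gaussian perturbation (fixed scale here)

A = anchor_graph(Z, U)

# Propagate over the implicit sample graph A A^T in O(n * m * d),
# multiplying right-to-left so no (n, n) matrix is ever formed.
H = A @ (A.T @ Z)
```

Evaluating `A @ (A.T @ Z)` right-to-left is a common anchor-graph trick; whether DMAC computes its full sample graph in exactly this way is not stated in the summary above.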

Paper Structure

This paper contains 30 sections, 1 theorem, 19 equations, 5 figures, 3 tables.

Key Result

Theorem 1

Minimizing Eq. (eq:anchor_learning) is equivalent to penalizing the trivial solution (i.e., representation collapse) to Eq. (eq:sp_loss).

Figures (5)

  • Figure 1: Pipeline of DMAC. Note that the encoders are omitted. For the $a$-th view, $\mathbf{Z}^{(a)}$ is the data embedding, $\mathbf{A}^{(a)}$ is the anchor graph, $\mathrm{AGCN}_a$ is the corresponding anchor graph convolution network, and $\mathbf{F}^{(a)}$ is the anchor clustering distribution that records the probability of an anchor belonging to each cluster. $\mathbf{Z}$ is the fusion embedding shared among views. $\mathbf{U}$ represents the learnable anchors injected with the perturbation. The overall framework is updated by minimizing Eq. (eq:overall_loss). The final result is obtained by performing $k$-means on the converged $\mathbf{Z}$.
  • Figure 2: Runtime (s) of deep models on four datasets. Note that all values are shown on a base-2 logarithmic scale.
  • Figure 3: Anchor similarity matrix $\mathbf{U}\mathbf{U}^{\mathrm{T}}$ on BBC.
  • Figure 4: Visualization of the fusion embedding $\mathbf{Z}$ on BBC. Each point is drawn as its ground-truth label value.
  • Figure 5: ACC of DMAC with different parameters $\alpha$ and $\beta$.

Theorems & Definitions (3)

  • Definition 1
  • Theorem 1
  • proof