SLRL: Structured Latent Representation Learning for Multi-view Clustering

Zhangci Xiong, Meng Cao

TL;DR

This paper tackles multi-view clustering by addressing both cross-view consistency and sample-level structure. It introduces Structured Latent Representation Learning (SLRL), which first learns a common latent space $\bm{H}$ from all views, constructs a $k$-nearest neighbor graph on $\bm{H}$, and uses a Graph Attention Network to obtain a structured embedding $\tilde{\bm{H}}$, guided by a KL-divergence clustering objective. The model optimizes a joint loss $L = L_r + \boldsymbol{\gamma} L_c$, where $L_r$ is a reconstruction loss and $L_c$ enforces cluster-friendly structure, enabling end-to-end training. Experiments on six datasets show SLRL achieves superior clustering performance, demonstrating the benefit of incorporating structural information into MVC for improved clustering accuracy and robustness.
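To make the joint objective concrete, here is a minimal NumPy sketch of a KL-divergence clustering loss combined with a reconstruction loss. The summary only states that the clustering objective is KL-based; the Student's t soft assignment and the squared-and-renormalized target distribution used below are a common choice in deep clustering and are an assumption here, not necessarily the paper's exact formulation. The function names, the toy embedding, the cluster centers, the reconstruction value `L_r`, and the trade-off weight `gamma` (named after the $\gamma$ in the Figure 3 caption) are all hypothetical.

```python
import numpy as np

def soft_assignments(H_tilde, centers):
    """Student's t similarity between embeddings and cluster centers; rows sum to 1.
    (Assumed form; the summary does not specify the kernel.)"""
    sq_dist = np.sum((H_tilde[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    q = 1.0 / (1.0 + sq_dist)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpen q by squaring it and normalizing by soft cluster frequency."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def kl_clustering_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def joint_loss(recon_loss, cluster_loss, gamma):
    """Joint objective: reconstruction loss plus weighted clustering loss."""
    return recon_loss + gamma * cluster_loss

# Toy structured embedding (4 samples, 2 dims) and 2 hypothetical cluster centers.
H_tilde = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]])
centers = np.array([[0.0, 0.0], [4.0, 4.0]])

q = soft_assignments(H_tilde, centers)
p = target_distribution(q)
L_c = kl_clustering_loss(p, q)
L_r = 0.5  # placeholder reconstruction loss, not a value from the paper
L = joint_loss(L_r, L_c, gamma=0.1)
```

In an end-to-end setup the same quantities would be computed with an automatic-differentiation framework so that gradients of both terms reach the shared representation; plain NumPy is used here only to show the arithmetic.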

Abstract

In recent years, Multi-View Clustering (MVC) has attracted increasing attention for its potential to reduce the annotation burden associated with large datasets. The aim of MVC is to exploit the inherent consistency and complementarity among different views, thereby integrating information from multiple perspectives to improve clustering outcomes. Despite extensive research in MVC, most existing methods focus predominantly on harnessing complementary information across views to enhance clustering effectiveness, often neglecting the structural information among samples, which is crucial for exploring sample correlations. To address this gap, we introduce a novel framework, termed Structured Latent Representation Learning based Multi-View Clustering method (SLRL). SLRL leverages both the complementary and structural information. Initially, it learns a common latent representation for all views. Subsequently, to exploit the structural information among samples, a k-nearest neighbor graph is constructed from this common latent representation. This graph facilitates enhanced sample interaction through graph learning techniques, leading to a structured latent representation optimized for clustering. Extensive experiments demonstrate that SLRL not only competes well with existing methods but also sets new benchmarks in various multi-view datasets.
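The graph-construction step can be illustrated with a short NumPy sketch that builds a $k$-nearest neighbor adjacency matrix from a latent representation. The Euclidean distance metric, the symmetrization rule, the function name `knn_graph`, and the toy data are assumptions made for illustration; the abstract does not state which of these choices SLRL uses.

```python
import numpy as np

def knn_graph(H, k):
    """Return a symmetric 0/1 adjacency matrix linking each row of H
    to its k nearest rows under squared Euclidean distance."""
    n = H.shape[0]
    # Pairwise squared Euclidean distances between all rows.
    sq_norms = np.sum(H ** 2, axis=1)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * H @ H.T
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    # Column indices of the k smallest distances in each row.
    neighbors = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=np.int64)
    rows = np.repeat(np.arange(n), k)
    A[rows, neighbors.ravel()] = 1
    # Symmetrize: keep an edge if either endpoint selected the other.
    return np.maximum(A, A.T)

# Toy latent representation: two well-separated groups of three samples.
H = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
A = knn_graph(H, k=2)
```

With two tight groups and `k=2`, every sample connects only to the other two members of its own group, which is the kind of neighborhood structure a graph layer can then propagate information over.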

Paper Structure (17 sections, 10 equations, 4 figures, 2 tables, 1 algorithm)

This paper contains 17 sections, 10 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: SLRL model framework.
  • Figure 2: t-SNE visualization of various methods on MSRCV1.
  • Figure 3: Sensitivity experiments for parameters $k$ and $\gamma$.
  • Figure 4: Convergent behavior of SLRL.