Multi-view Granular-ball Contrastive Clustering

Peng Su, Shudong Huang, Weihong Ma, Deng Xiong, Jiancheng Lv

TL;DR

This paper tackles multi-view clustering by addressing false negatives and neglect of local structure in existing instance- and cluster-level contrastive methods. It introduces Multi-view Granular-ball Contrastive Clustering (MGBCC), which represents data with coarse granules (granular balls) in a shared latent space to capture local topology across views. Granular balls are generated per view via direct k-means with a granularity parameter $p$, centers and radii are computed, and intra-view overlaps and cross-view intersections guide associations; a granular-ball contrastive loss $L_{con}$ and a reconstruction loss $L_{rec}$ are combined into a total objective $L_{total}$. Experiments on seven multi-view datasets show competitive or superior clustering performance, validating the method's ability to preserve local structure and improve discriminability.
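The granular-ball construction summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the number of balls is taken as $\lceil n/p \rceil$, a ball's radius is taken as the mean distance from its center to its members, two balls in the same view are said to overlap when the distance between their centers does not exceed the sum of their radii, and the cross-view intersection is counted as the number of shared sample indices. The function names `kmeans`, `granular_balls`, `intra_view_overlap`, and `cross_view_intersection` are hypothetical.

```python
import numpy as np


def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of x."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies the rows, so updating centers leaves x untouched.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels


def granular_balls(z, p):
    """Split latent features z of one view into balls of roughly p samples.

    Assumption: k = ceil(n / p) and radius = mean member-to-center distance.
    Returns a list of (member_indices, center, radius) tuples.
    """
    k = max(1, int(np.ceil(len(z) / p)))
    labels = kmeans(z, k)
    balls = []
    for j in np.unique(labels):  # empty clusters are skipped
        idx = np.flatnonzero(labels == j)
        center = z[idx].mean(axis=0)
        radius = np.linalg.norm(z[idx] - center, axis=1).mean()
        balls.append((idx, center, radius))
    return balls


def intra_view_overlap(balls):
    """Boolean matrix: True where two distinct balls of one view overlap."""
    centers = np.stack([c for _, c, _ in balls])
    radii = np.array([r for _, _, r in balls])
    gaps = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    overlap = gaps <= radii[:, None] + radii[None, :]
    np.fill_diagonal(overlap, False)  # a ball is not paired with itself
    return overlap


def cross_view_intersection(balls_a, balls_b):
    """Matrix of shared-sample counts between balls of two views."""
    return np.array([[len(np.intersect1d(ia, ib)) for ib, _, _ in balls_b]
                     for ia, _, _ in balls_a])
```

Given latent features `z_a` and `z_b` for the same samples in two views (hypothetical names), `granular_balls(z_a, p=10)` and `granular_balls(z_b, p=10)` yield the per-view ball sets, and the two matrices above indicate which ball pairs would be candidates for intra-view and cross-view association. How these associations enter $L_{con}$ is defined in the paper itself and is not reproduced here.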

Abstract

Previous multi-view contrastive learning methods typically operate at two scales: instance-level and cluster-level. Instance-level approaches construct positive and negative pairs based on sample correspondences, aiming to bring positive pairs closer and push negative pairs further apart in the latent space. Cluster-level methods focus on calculating cluster assignments for samples under each view and maximize view consensus by reducing distribution discrepancies, e.g., minimizing KL divergence or maximizing mutual information. However, these two types of methods either introduce false negatives, leading to reduced model discriminability, or overlook local structures and cannot measure relationships between clusters across views explicitly. To this end, we propose a method named Multi-view Granular-ball Contrastive Clustering (MGBCC). MGBCC segments the sample set into coarse-grained granular balls, and establishes associations between intra-view and cross-view granular balls. These associations are reinforced in a shared latent space, thereby achieving multi-granularity contrastive learning. Granular balls lie between instances and clusters, naturally preserving the local topological structure of the sample set. We conduct extensive experiments to validate the effectiveness of the proposed method.

Paper Structure

This paper contains 18 sections, 14 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Examples of granular balls.
  • Figure 2: The framework of MGBCC. The overall loss function consists of two parts, i.e., the reconstruction loss and the granular-ball contrastive loss. We construct granular-ball sets $\{S^v\}_{v=1}^V$ for the different views in the latent space and establish intra-view and cross-view associations based on overlap and intersection size, respectively. Granular balls model the local structure of the dataset, and associated granular balls should be close to each other in the latent space.
  • Figure 3: The t-SNE visualization of the clustering results on the MNIST-USPS dataset.
  • Figure 4: The clustering accuracy (%) with different parameters $p$ and $d$ on Caltech101-20 and Cora.
  • Figure 5: Loss vs. Metrics on Cora.