MCFCN: Multi-View Clustering via a Fusion-Consensus Graph Convolutional Network

Chenping Pei, Fadi Dornaika, Jingjun Bi

TL;DR

MCFCN introduces an end-to-end framework for multi-view clustering that learns a consensus graph and consensus representations from multiple views. It combines a Multi-View Feature Fusion Module with a Unified Graph Structure Adapter to obtain a fused feature $\boldsymbol{F}_f$ and fused adjacency $\boldsymbol{A}_f$, then uses a three-layer GCN to produce robust node representations, guided by SMAL and FRAL losses and enhanced by a multi-view kernel-k-means and a spectral clustering objective. The model is trained jointly to align cross-view topologies and representations, yielding state-of-the-art clustering performance on eight benchmarks and robust, interpretable consensus graphs. The authors provide extensive ablations and qualitative visualizations, validating the contribution of each component and offering code for reproducibility.

Abstract

Existing Multi-view Clustering (MVC) methods based on subspace learning focus on consensus representation learning while neglecting the inherent topological structure of the data. Although Graph Neural Networks (GNNs) have been integrated into MVC, their input graph structures remain susceptible to noise. Methods based on Multi-view Graph Refinement (MGRC) also have limitations: insufficient consideration of cross-view consistency, difficulty in handling samples that are hard to distinguish in the feature space, and disjointed optimization caused by separate graph construction algorithms. To address these issues, a Multi-View Clustering method via a Fusion-Consensus Graph Convolutional Network (MCFCN) is proposed. The network learns the consensus graph of multi-view data in an end-to-end manner and learns effective consensus representations through a view feature fusion model and a Unified Graph Structure Adapter (UGA). It introduces a Similarity Matrix Alignment Loss (SMAL) and a Feature Representation Alignment Loss (FRAL). Guided by the consensus, it optimizes view-specific graphs, preserves cross-view topological consistency, promotes the construction of intra-class edges, and learns effective consensus representations with the help of a GCN to improve clustering performance. MCFCN demonstrates state-of-the-art performance on eight multi-view benchmark datasets, and its effectiveness is verified by extensive qualitative and quantitative experiments. The code will be provided at https://github.com/texttao/MCFCN.
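
The abstract names SMAL and FRAL but does not give their formulas. To make the idea of aligning views with a consensus concrete, here is a minimal NumPy sketch of what such alignment terms could look like. Everything in it is an assumption for illustration: the cosine similarity, the mean-squared-error form, and the function names `similarity_alignment_loss` and `representation_alignment_loss` are hypothetical stand-ins, not the paper's definitions.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between rows of X, shape (n_samples, n_samples)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # guard against zero-length rows
    return Xn @ Xn.T

def similarity_alignment_loss(view_embeddings, consensus_embedding):
    """Hypothetical SMAL-like term (assumption): mean squared difference between
    each view's similarity matrix and the consensus similarity matrix."""
    S_c = cosine_similarity_matrix(consensus_embedding)
    per_view = [np.mean((cosine_similarity_matrix(Z) - S_c) ** 2)
                for Z in view_embeddings]
    return float(np.mean(per_view))

def representation_alignment_loss(view_embeddings, consensus_embedding):
    """Hypothetical FRAL-like term (assumption): mean squared difference between
    each view's embedding and the consensus embedding."""
    per_view = [np.mean((Z - consensus_embedding) ** 2) for Z in view_embeddings]
    return float(np.mean(per_view))

# Toy data: three noisy copies of one consensus embedding (6 samples, 4 dims).
rng = np.random.default_rng(0)
consensus = rng.normal(size=(6, 4))
views = [consensus + 0.1 * rng.normal(size=(6, 4)) for _ in range(3)]

smal_like = similarity_alignment_loss(views, consensus)
fral_like = representation_alignment_loss(views, consensus)
# Views identical to the consensus incur no alignment penalty.
identical = similarity_alignment_loss([consensus, consensus], consensus)
```

Both terms shrink toward zero as the view-specific quantities approach the consensus, which is the behavior the abstract describes as preserving cross-view topological consistency. The actual losses in the paper may differ in form and weighting.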

Paper Structure

This paper contains 25 sections, 29 equations, 8 figures, 6 tables, 1 algorithm.

Figures (8)

  • Figure 1: The overall architecture of MCFCN. First, a linear transformation is applied to each view to unify the feature dimensions, and the transformed features are fused. A Unified Graph Structure Adapter then generates learnable graphs for the subsequent GCN, enabling joint optimization of the graph structures and their corresponding representations. The GCN extracts features to obtain the consensus representation, which is concatenated with the output of an intermediate GCN layer; the K-means algorithm is then applied to obtain the final clustering results. In addition, a loss function guides the model to learn a view-consistent topological structure and the corresponding feature representations, improving model performance.
  • Figure 2: Visualization of the consensus graph on the BBCSport, 100Leaves, Mfeat, and Caltech101-7 datasets.
  • Figure 3: t-SNE visualization of the raw features and the representations learned by different methods on the 3Sources dataset.
  • Figure 4: t-SNE visualization of the raw features and the representations learned by different methods on the 100Leaves dataset.
  • Figure 5: t-SNE visualization of the raw features and the representations learned by different methods on the Caltech101-7 dataset.
  • ...and 3 more figures
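
The pipeline described in the Figure 1 caption (per-view projection, fusion, graph generation, three GCN layers, concatenation, K-means) can be sketched in NumPy as a forward pass. This is an illustration under stated assumptions, not the authors' implementation: random weights stand in for learned parameters, fusion is a plain average, a fixed kNN graph (`knn_graph`) stands in for the learnable Unified Graph Structure Adapter, the second GCN layer is taken as the "intermediate" layer, and all function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Standard GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def knn_graph(F, k):
    """Assumed stand-in for the graph adapter: symmetric kNN graph on cosine similarity."""
    Fn = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    S = Fn @ Fn.T
    np.fill_diagonal(S, -np.inf)               # exclude self-matches
    idx = np.argsort(-S, axis=1)[:, :k]        # k most similar nodes per row
    A = np.zeros_like(S)
    A[np.repeat(np.arange(S.shape[0]), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                  # symmetrize

def forward(views, hidden_dim=8, k=3, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Project each view to a shared dimension (random weights, not learned).
    projected = [X @ rng.normal(scale=X.shape[1] ** -0.5, size=(X.shape[1], hidden_dim))
                 for X in views]
    # 2) Fuse the projected views; a plain average is an assumption.
    F_f = np.mean(projected, axis=0)
    # 3) Build the fused adjacency and normalize it.
    A_f = knn_graph(F_f, k)
    A_norm = normalize_adjacency(A_f)
    # 4) Three GCN layers: H <- ReLU(A_norm @ H @ W).
    H, layer_outputs = F_f, []
    for _ in range(3):
        W = rng.normal(scale=hidden_dim ** -0.5, size=(hidden_dim, hidden_dim))
        H = np.maximum(A_norm @ H @ W, 0.0)
        layer_outputs.append(H)
    # 5) Concatenate the final output with an intermediate layer (second layer assumed).
    Z = np.concatenate([layer_outputs[-1], layer_outputs[1]], axis=1)
    return F_f, A_f, Z

def kmeans(Z, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's algorithm, included to keep the sketch dependency-free."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(Z.shape[0], size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):            # keep old center if cluster is empty
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Toy usage: two views of 12 samples with different feature dimensions.
rng = np.random.default_rng(1)
views = [rng.normal(size=(12, d)) for d in (5, 7)]
F_f, A_f, Z = forward(views)
labels = kmeans(Z, n_clusters=3)
```

In the method as described, the projections, the graph, and the GCN weights are optimized jointly under the alignment and clustering losses; this sketch only shows how data flows through the stages the caption lists.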