Information Recovery-Driven Deep Incomplete Multiview Clustering Network

Chengliang Liu, Jie Wen, Zhihao Wu, Xiaoling Luo, Chao Huang, Yong Xu

TL;DR

This work tackles incomplete multi-view clustering by introducing RecFormer, a two-stage transformer-style autoencoder that jointly learns cross-view semantics and recovers missing views. A cross-view encoder produces integrated embeddings, while a recurrent graph constraint leverages an imputed, approximately complete graph to regularize feature learning and view recovery. Stage 1 performs missing-view restoration; Stage 2 clusters the fused representation $\bar{Z}$ using $K$-means, informed by recovered data. Across five datasets and multiple missing-rate scenarios, RecFormer achieves consistent improvements over state-of-the-art methods, demonstrating robustness and practical impact for incomplete multi-view data.
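
The fuse-then-cluster step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not state how the view-specific embeddings are fused, so the mask-weighted mean below is an assumption, and the names `fuse_views`, `kmeans`, and `sq_dist` are hypothetical. In RecFormer the embeddings would come from the cross-view encoder; here they are toy vectors, and the $K$-means routine is a plain Lloyd iteration with a deterministic farthest-point initialization chosen only to keep the example reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuse_views(view_embeddings, mask):
    """Average each sample's embeddings over its observed views only.

    view_embeddings[v][i] is the embedding of sample i in view v.
    mask[i][v] is 1 if view v of sample i is observed, 0 if it is missing.
    ASSUMPTION: mask-weighted mean fusion; the source does not give the rule.
    """
    n_views = len(view_embeddings)
    fused = []
    for i, row in enumerate(mask):
        observed = [v for v in range(n_views) if row[v]]
        if not observed:
            raise ValueError(f"sample {i} has no observed view")
        dim = len(view_embeddings[observed[0]][i])
        fused.append([
            sum(view_embeddings[v][i][d] for v in observed) / len(observed)
            for d in range(dim)
        ])
    return fused


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialization (illustrative)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: 2 views, 4 samples, 2-dimensional embeddings.
# Entries for missing views are placeholders that the mask causes to be ignored.
views = [
    [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [0.0, 0.0]],  # view 0 (sample 3 missing)
    [[0.1, 0.0], [0.0, 0.0], [5.2, 4.9], [4.8, 5.0]],  # view 1 (sample 1 missing)
]
mask = [[1, 1], [1, 0], [1, 1], [0, 1]]

fused = fuse_views(views, mask)
labels, centers = kmeans(fused, k=2)
print(labels)
```

In this toy setting, samples 0 and 1 fall in one cluster and samples 2 and 3 in the other, even though samples 1 and 3 each have one view missing. The placeholder rows never influence the result because the mask excludes them from the average.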

Abstract

Incomplete multi-view clustering is a hot and emerging topic. It is well known that unavoidable data incompleteness greatly weakens the effective information of multi-view data. To date, existing incomplete multi-view clustering methods usually bypass unavailable views according to prior missing information, which is considered a second-best scheme based on evasion. Other methods that attempt to recover missing information are mostly applicable only to specific two-view datasets. To handle these problems, in this paper, we propose an information recovery-driven deep incomplete multi-view clustering network, termed RecFormer. Concretely, a two-stage autoencoder network with a self-attention structure is built to simultaneously extract high-level semantic representations of multiple views and recover the missing data. In addition, we develop a recurrent graph reconstruction mechanism that leverages the restored views to promote representation learning and further data reconstruction. Visualizations of the recovery results are given, and extensive experimental results confirm that RecFormer has clear advantages over other top methods.

Paper Structure (18 sections, 11 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: Missing view completion.
  • Figure 2: Main framework of our model. FC denotes a fully connected layer, and a mask operation is applied in the first cross-view encoder. In Stage 1, incomplete multi-view data is input to the encoder to extract view-specific embedding features $\bm{\mathsf{Z}}$, and these incomplete features are fused into a complete fused representation $\bar{Z}$. The decoder then reconstructs the original features to obtain the predicted missing views. In Stage 2, the entire model performs complete multi-view clustering to obtain the final result $Y$.
  • Figure 3: Example pairs of missing views and their restoration results. (a) shows missing views sampled from the NH_face and Handwritten databases; (b) shows the corresponding views recovered by our RecFormer.
  • Figure 4: Feature space visualization of final clustering representations of different methods via t-SNE on the Handwritten dataset with a 30% incomplete rate.
  • Figure 5: The ACC and Loss curves on the Handwritten dataset and Caltech7 dataset with a 50% missing rate in Stage 2.
  • ...and 1 more figures