Self Supervised Correlation-based Permutations for Multi-View Clustering
Ran Eisenberg, Jonathan Svirsky, Ofir Lindenbaum
TL;DR
The paper tackles end-to-end multi-view clustering (MVC) for general data types by learning fused representations through a permutation-based canonical correlation objective, which removes the need for a separate representation-learning stage. The proposed method, COPER, jointly optimizes deep Canonically Correlated Encoders with a self-supervised multi-view pseudo-labeling and within-cluster permutation scheme, yielding representations that approximate the projection of supervised Linear Discriminant Analysis (LDA) under mild assumptions. Theoretical results establish this LDA approximation and bound the eigenvalue error caused by pseudo-label noise. Empirically, COPER outperforms state-of-the-art deep MVC models on ten datasets and scales to large data. The approach applies to both image and tabular data, and its permutation-based augmentation offers a general, potentially more effective alternative to standard CCA-based MVC.
Abstract
Combining data from different sources can improve data analysis tasks such as clustering. However, most of the current multi-view clustering methods are limited to specific domains or rely on a suboptimal and computationally intensive two-stage process of representation learning and clustering. We propose an end-to-end deep learning-based multi-view clustering framework for general data types (such as images and tables). Our approach involves generating meaningful fused representations using a novel permutation-based canonical correlation objective. We provide a theoretical analysis showing how the learned embeddings approximate those obtained by supervised linear discriminant analysis (LDA). Cluster assignments are learned by identifying consistent pseudo-labels across multiple views. Additionally, we establish a theoretical bound on the error caused by incorrect pseudo-labels in the unsupervised representations compared to LDA. Extensive experiments on ten multi-view clustering benchmark datasets provide empirical evidence for the effectiveness of the proposed model.
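The within-cluster permutation idea mentioned above can be illustrated with a minimal sketch. It assumes pseudo-labels are already available and only shows how samples of one view might be re-paired with samples of another view that share the same pseudo-label. The function name `within_cluster_permutation`, the toy arrays, and the use of NumPy are illustrative assumptions, not the authors' implementation; the encoders and the correlation objective are omitted.

```python
import numpy as np

def within_cluster_permutation(labels, rng):
    """Hypothetical helper: return indices `perm` with labels[perm[i]] == labels[i].

    Samples are shuffled only among members of the same pseudo-label cluster.
    """
    labels = np.asarray(labels)
    perm = np.arange(labels.shape[0])
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # positions belonging to cluster c
        perm[idx] = rng.permutation(idx)    # shuffle those positions among themselves
    return perm

# Toy data (assumed for illustration): two views of 8 samples and 3 pseudo-label clusters.
rng = np.random.default_rng(0)
pseudo_labels = np.array([0, 1, 0, 2, 1, 0, 2, 1])
view_a = rng.normal(size=(8, 4))
view_b = rng.normal(size=(8, 3))

# Re-pair view_b with view_a so that each new pair shares a pseudo-label.
perm = within_cluster_permutation(pseudo_labels, rng)
view_b_permuted = view_b[perm]
```

In this sketch, the pairs `(view_a[i], view_b_permuted[i])` act as additional cross-view pairs that agree on the pseudo-label, which is the kind of augmented pairing a correlation-based objective could then be trained on.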
