Exploring a Principled Framework for Deep Subspace Clustering
Xianghan Meng, Zhiyuan Huang, Wei He, Xianbiao Qi, Rong Xiao, Chun-Guang Li
TL;DR
This work addresses two weaknesses of existing deep subspace clustering methods: they often suffer from feature collapse, and they lack theoretical guarantees of recovering a union-of-subspaces (UoS) structure. It introduces PRO-DSC, a principled framework that regularizes the learned representations with a log-determinant term, enabling simultaneous learning of structured representations and self-expressive coefficients. The authors prove eigenspace alignment between the representation Gram matrix and the self-expressive error, derive conditions that prevent collapse, and show that, under certain regimes, the learned representations form a union of orthogonal subspaces with a block-diagonal self-expressive matrix. A scalable implementation via reparameterization, Sinkhorn-based coefficient learning, and differentiable programming supports large datasets and unseen samples. Extensive experiments on synthetic data and six real-world benchmarks, using CLIP and BYOL pre-trained features, demonstrate superior clustering performance and provide qualitative evidence of non-collapse and UoS structure. Code is provided for reproducibility.
Abstract
Subspace clustering is a classical unsupervised learning task, built on the basic assumption that high-dimensional data can be approximated by a union of subspaces (UoS). However, real-world data often deviate from the UoS assumption. To address this challenge, state-of-the-art deep subspace clustering algorithms attempt to jointly learn UoS representations and self-expressive coefficients. Yet the general framework of the existing algorithms suffers from catastrophic feature collapse and lacks a theoretical guarantee of learning the desired UoS representations. In this paper, we present a Principled fRamewOrk for Deep Subspace Clustering (PRO-DSC), which is designed to learn structured representations and self-expressive coefficients in a unified manner. Specifically, in PRO-DSC, we incorporate an effective regularization on the learned representations into the self-expressive model, prove that the regularized self-expressive model is able to prevent feature space collapse, and demonstrate that the learned optimal representations, under certain conditions, lie on a union of orthogonal subspaces. Moreover, we provide a scalable and efficient approach to implementing PRO-DSC and conduct extensive experiments to verify our theoretical findings and demonstrate the superior performance of our proposed deep subspace clustering approach. The code is available at https://github.com/mengxianghan123/PRO-DSC.
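
To make the collapse issue concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy objective of the form -1/2 log det(I + alpha Z^T Z) + gamma/2 ||Z - ZC||_F^2; the exact form, the weights `alpha` and `gamma`, and all function names are illustrative assumptions rather than details stated in the abstract. It compares a collapsed representation with one lying on two orthogonal subspaces, both of which have zero self-expressive error and the same Frobenius norm.

```python
import numpy as np

def self_expressive_error(Z, C):
    """Squared Frobenius residual ||Z - Z C||_F^2 (columns of Z are samples)."""
    return float(np.linalg.norm(Z - Z @ C, "fro") ** 2)

def logdet_term(Z, alpha=1.0):
    """log det(I + alpha * Z^T Z); grows as the samples span more directions."""
    n = Z.shape[1]
    _, logdet = np.linalg.slogdet(np.eye(n) + alpha * Z.T @ Z)
    return float(logdet)

def toy_objective(Z, C, alpha=1.0, gamma=1.0):
    """Assumed toy objective: log-det regularizer plus self-expressive error."""
    return -0.5 * logdet_term(Z, alpha) + 0.5 * gamma * self_expressive_error(Z, C)

n_per, n = 3, 6

# Collapsed representation: every sample is the same unit vector.
Z_collapsed = np.tile(np.array([[1.0], [0.0]]), (1, n))
# Each sample is the average of all the others (zero diagonal).
C_collapsed = (np.ones((n, n)) - np.eye(n)) / (n - 1)

# UoS representation: three samples on each of two orthogonal 1-D subspaces.
Z_uos = np.kron(np.eye(2), np.ones((1, n_per)))
# Block-diagonal coefficients: samples are expressed only within their subspace.
C_block = (np.ones((n_per, n_per)) - np.eye(n_per)) / (n_per - 1)
C_uos = np.kron(np.eye(2), C_block)

err_collapsed = self_expressive_error(Z_collapsed, C_collapsed)
err_uos = self_expressive_error(Z_uos, C_uos)
ld_collapsed = logdet_term(Z_collapsed)
ld_uos = logdet_term(Z_uos)
obj_collapsed = toy_objective(Z_collapsed, C_collapsed)
obj_uos = toy_objective(Z_uos, C_uos)

print("self-expressive error:", err_collapsed, err_uos)
print("log-det term:", ld_collapsed, ld_uos)
print("toy objective:", obj_collapsed, obj_uos)
```

In this toy setting the self-expressive error alone cannot distinguish the two cases, since both are reconstructed exactly, whereas the log-determinant term is larger for the orthogonal-subspace representation, so the assumed regularized objective prefers it.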
