Cross-view Joint Learning for Mixed-Missing Multi-view Unsupervised Feature Selection
Zongxin Shen, Yanyong Huang, Dongjie Wang, Jinyuan Chang, Fengmao Lv, Tianrui Li, Xiaoyi Jiang
TL;DR
This work tackles mixed-missing multi-view unsupervised feature selection (MUFS) by proposing CLIM-FS, a joint learning framework that integrates missing-view and missing-variable imputation with feature selection through a nonnegative orthogonal matrix factorization model. It exploits cross-view consensus via a shared cluster indicator $\bm{F}^{*}$ and cross-view local geometry via graphs $\bm{S}^{v}$ and $\bm{H}$, which jointly guide imputation and discriminative feature selection. The authors provide theoretical results showing that the imputation preserves intra-cluster and inter-cluster structure as well as cross-view geometry, along with convergence guarantees for the optimization algorithm, and report superior performance on eight real-world datasets against strong baselines. Overall, CLIM-FS extends MUFS to realistic mixed-missing settings and offers a principled, scalable approach for robust feature selection in heterogeneous multi-view data.
Abstract
Incomplete multi-view unsupervised feature selection (IMUFS), which aims to identify representative features from unlabeled multi-view data containing missing values, has received growing attention in recent years. Despite their promising performance, existing methods face three key challenges: 1) because they focus solely on the view-missing problem, they are poorly suited to the mixed-missing scenario that is more prevalent in practice, where some samples lack entire views while others lack only some features within a view; 2) insufficient utilization of consistency and diversity across views limits the effectiveness of feature selection; and 3) the lack of theoretical analysis leaves it unclear how feature selection and data imputation interact during the joint learning process. To address these challenges, we propose CLIM-FS, a novel IMUFS method designed for the mixed-missing problem. Specifically, we integrate the imputation of both missing views and missing variables into a feature selection model based on nonnegative orthogonal matrix factorization, enabling the joint learning of feature selection and adaptive data imputation. Furthermore, we fully leverage the consensus cluster structure and the cross-view local geometrical structure to enhance the synergistic learning process. We also provide a theoretical analysis to clarify the underlying collaborative mechanism of CLIM-FS. Experimental results on eight real-world multi-view datasets demonstrate that CLIM-FS outperforms state-of-the-art methods.
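
To make the mixed-missing setting concrete, the following minimal sketch builds an observation mask for a single view in which some samples lack the whole view and others lack only individual features. It illustrates the data setting only and is not part of CLIM-FS; the function name `mixed_missing_mask` and the parameters `view_rate` and `variable_rate` are hypothetical and chosen purely for illustration.

```python
import random


def mixed_missing_mask(n_samples, n_features, view_rate, variable_rate, seed=0):
    """Build a boolean observation mask for one view (True = observed).

    Hypothetical helper for illustration only:
    - with probability `view_rate`, a sample lacks the entire view
      (its whole row is unobserved);
    - otherwise, each feature of that sample is independently
      unobserved with probability `variable_rate`.
    """
    rng = random.Random(seed)
    mask = []
    view_missing = []  # True where the sample lacks the whole view
    for _ in range(n_samples):
        if rng.random() < view_rate:
            mask.append([False] * n_features)
            view_missing.append(True)
        else:
            mask.append([rng.random() >= variable_rate for _ in range(n_features)])
            view_missing.append(False)
    return mask, view_missing


# Two views with different feature dimensions, each with its own mask.
view_dims = [5, 8]
masks = [
    mixed_missing_mask(100, d, view_rate=0.2, variable_rate=0.1, seed=i)
    for i, d in enumerate(view_dims)
]
```

A method that handles only the view-missing case corresponds to `variable_rate=0`, where every row of the mask is either fully observed or fully unobserved; the mixed-missing case allows both patterns in the same dataset.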
