Task-Augmented Cross-View Imputation Network for Partial Multi-View Incomplete Multi-Label Classification
Lian Zhao, Jie Wen, Xiaohuan Lu, Wai Keung Wong, Jiang Long, Wulin Xie
TL;DR
This work tackles partial multi-view incomplete multi-label classification (PMVIMLC) by introducing TACVI-Net, a two-stage framework. The first stage extracts task-relevant features from each view using a variational encoder driven by the information bottleneck principle. The second stage uses deep autoencoder-based cross-view reconstruction to recover missing views and obtain complete multi-view representations, enabling more accurate multi-label prediction through a weighted fusion of views and a missing-label-aware loss. Empirical results on five datasets show that TACVI-Net outperforms state-of-the-art methods, and ablation studies confirm that each component (task-oriented augmentation and cross-view imputation) contributes to robustness under substantial view and label missingness. The approach advances practical multi-view multi-label classification by leveraging cross-view consistency and reducing noise from task-irrelevant information, improving classification performance in real-world incomplete-data scenarios.
Abstract
In real-world scenarios, multi-view multi-label learning often encounters incomplete training data due to limitations in data collection and unreliable annotation processes. The absence of multi-view features impairs the comprehensive understanding of samples, omitting crucial details essential for classification. To address this issue, we present a task-augmented cross-view imputation network (TACVI-Net) for partial multi-view incomplete multi-label classification. Specifically, we employ a two-stage network that derives highly task-relevant features and uses them to recover the missing views. In the first stage, we leverage information bottleneck theory to obtain a discriminative representation of each view by extracting task-relevant information through a view-specific encoder-classifier architecture. In the second stage, an autoencoder-based multi-view reconstruction network extracts high-level semantic representations of the augmented features and recovers the missing data, thereby aiding the final classification task. Extensive experiments on five datasets demonstrate that TACVI-Net outperforms other state-of-the-art methods.
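
To make the final prediction step mentioned in the TL;DR more concrete, the following is a minimal NumPy sketch of a weighted fusion of per-view predictions together with a binary cross-entropy loss that skips unobserved labels. It is an illustration under stated assumptions, not the authors' implementation: the function names `weighted_fusion` and `masked_bce`, the array shapes, and the choice of per-sample view weights are all hypothetical, and the summary above does not specify how the paper computes its fusion weights or the exact form of its loss.

```python
import numpy as np

def weighted_fusion(view_probs, view_weights):
    """Fuse per-view label probabilities with per-sample view weights.

    view_probs:   (V, N, C) probabilities from V views, N samples, C labels.
    view_weights: (V, N) nonnegative weights (assumed; how they are chosen
                  is not taken from the paper).
    Returns an (N, C) array of fused probabilities.
    """
    total = np.clip(view_weights.sum(axis=0, keepdims=True), 1e-12, None)
    w = view_weights / total                      # normalize over views
    return (w[:, :, None] * view_probs).sum(axis=0)

def masked_bce(probs, labels, label_mask, eps=1e-7):
    """Binary cross-entropy averaged over observed label entries only.

    label_mask is 1 where a label is known and 0 where it is missing,
    so missing labels contribute nothing to the loss.
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    per_entry = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    observed = max(float(label_mask.sum()), 1.0)  # avoid division by zero
    return float((per_entry * label_mask).sum() / observed)

# Toy example: 2 views, 1 sample, 2 labels; the second label is unobserved.
view_probs = np.array([[[0.9, 0.2]], [[0.5, 0.6]]])
view_weights = np.array([[3.0], [1.0]])
fused = weighted_fusion(view_probs, view_weights)
loss = masked_bce(fused, np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]]))
```

Because the mask zeroes out the second label, changing its (unknown) target value leaves the loss unchanged, which is the behavior a missing-label-aware loss is meant to provide.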
