Task-Augmented Cross-View Imputation Network for Partial Multi-View Incomplete Multi-Label Classification

Lian Zhao, Jie Wen, Xiaohuan Lu, Wai Keung Wong, Jiang Long, Wulin Xie

TL;DR

This work tackles partial multi-view incomplete multi-label classification (PMVIMLC) by introducing TACVI-Net, a two-stage framework. The first stage extracts task-relevant features from each view using an information-bottleneck-driven variational encoder. The second stage uses deep autoencoder-based cross-view reconstruction to impute the missing views and obtain complete multi-view representations, which feed a multi-label predictor built on a weighted fusion of views and a missing-label-aware loss. Empirical results on five datasets show that TACVI-Net outperforms state-of-the-art methods, and ablation studies confirm that both components (task-oriented augmentation and cross-view imputation) contribute to robustness under substantial view and label missingness. By leveraging cross-view consistency and reducing noise from task-irrelevant information, the approach improves multi-view multi-label classification on incomplete real-world data.
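
The weighted fusion of views and the missing-label-aware loss mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuse_views` and `masked_bce`, the use of plain non-negative weights, and the choice of binary cross-entropy averaged over observed labels are all assumptions made for illustration.

```python
import math

def fuse_views(view_scores, weights):
    """Weighted average of per-view label scores (illustrative assumption).

    view_scores: list of V lists, each holding C scores in (0, 1)
    weights:     list of V non-negative weights; a weight of 0 drops a view
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one view needs a positive weight")
    num_labels = len(view_scores[0])
    return [
        sum(w * scores[c] for scores, w in zip(view_scores, weights)) / total
        for c in range(num_labels)
    ]

def masked_bce(pred, target, label_mask, eps=1e-12):
    """Binary cross-entropy averaged over observed labels only.

    label_mask[c] is 1 if label c was annotated and 0 if it is missing,
    so unknown labels contribute nothing to the loss.
    """
    observed = sum(label_mask)
    if observed == 0:
        return 0.0
    loss = 0.0
    for p, y, g in zip(pred, target, label_mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        loss -= g * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss / observed

# Three views, two labels; the third view is given zero weight.
fused = fuse_views([[0.2, 0.8], [0.6, 0.4], [0.9, 0.9]], [1, 1, 0])
# Only the first label is observed; the second is ignored by the loss.
loss = masked_bce(fused, [0, 1], [1, 0])
```

In this sketch the mask simply zeroes out the terms for unannotated labels, so changing the target of a missing label leaves the loss unchanged.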

Abstract

In real-world scenarios, multi-view multi-label learning often encounters incomplete training data due to limitations in data collection and unreliable annotation processes. Missing views impair the comprehensive understanding of samples by omitting details that are essential for classification. To address this issue, we present a task-augmented cross-view imputation network (TACVI-Net) for partial multi-view incomplete multi-label classification. Specifically, we employ a two-stage network that derives highly task-relevant features and uses them to recover the missing views. In the first stage, we leverage information bottleneck theory to obtain a discriminative representation of each view by extracting task-relevant information through a view-specific encoder-classifier architecture. In the second stage, an autoencoder-based multi-view reconstruction network extracts high-level semantic representations of the augmented features and recovers the missing data, thereby aiding the final classification task. Extensive experiments on five datasets demonstrate that our TACVI-Net outperforms other state-of-the-art methods.
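
For readers unfamiliar with the information bottleneck, its generic form is sketched below. The notation is an assumption made here and is not taken from the paper, whose own equations may differ: $x^{(v)}$ is the input of view $v$, $z^{(v)}$ its learned representation, $y$ the label vector, $\beta > 0$ a trade-off weight, $q_\phi$ a stochastic encoder, $q_\theta$ a classifier, and $r$ a fixed prior.

```latex
% Information bottleneck: keep what predicts y, compress away the rest of x
\max_{p(z^{(v)} \mid x^{(v)})} \; I\bigl(z^{(v)}; y\bigr) \;-\; \beta \, I\bigl(z^{(v)}; x^{(v)}\bigr)

% A commonly used variational surrogate, minimized per view
\mathcal{L}^{(v)} =
  \mathbb{E}_{z^{(v)} \sim q_\phi(z^{(v)} \mid x^{(v)})}
    \bigl[ -\log q_\theta\bigl(y \mid z^{(v)}\bigr) \bigr]
  \;+\; \beta \, \mathrm{KL}\bigl( q_\phi(z^{(v)} \mid x^{(v)}) \,\big\|\, r(z^{(v)}) \bigr)
```

The first term of the surrogate rewards representations that predict the labels, while the KL term penalizes information retained about the input beyond what the prior allows, which matches the encoder-classifier pairing described in the abstract.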

Paper Structure (23 sections, 14 equations, 5 figures, 5 tables, 1 algorithm)

Figures (5)

  • Figure 1: The sketch of our TACVI-Net. The TAEncoder refers to a task-augmented encoder based on variational encoding, while the VSEncoder and VSDecoder denote view-specific encoders and decoders that are designed with a symmetrical structure relative to each other, both of which are based on MLPs.
  • Figure 2: The performance outcomes for Corel5k are depicted under two scenarios: (a) various proportions of missing views accompanied by a 50% rate of missing labels, and (b) a steady 50% rate of missing views coupled with differing levels of missing labels.
  • Figure 3: More experimental results for all datasets with different view-missing ratios and different label-missing ratios.
  • Figure 4: The performance outcomes for different training sample ratios on (a) the Corel5k dataset and (b) the Pascal07 dataset with a 50% missing-view ratio and a 50% missing-label ratio.
  • Figure 5: The hyper-parameter sensitivity experiments on (a) the Corel5k dataset and (b) the Pascal07 dataset with 50% missing instances, 50% missing labels, and 70% training samples.