Showing Many Labels in Multi-label Classification Models: An Empirical Study of Adversarial Examples
Yujiang Liu, Wenjian Luo, Zhijian Chen, Muhammad Luqman Naseem
TL;DR
This study examines the vulnerability of multi-label deep networks to adversarial perturbations by introducing a novel attack objective, "Showing Many Labels", which seeks to maximize the number of positive labels predicted for each instance. The authors evaluate nine attacks (eight adapted from multi-class settings and one designed specifically for multi-label classification) against two models, ML-LIW and ML-GCN, trained on four datasets (VOC2007, VOC2012, NUS-WIDE, COCO). The attacks are run with the AdverTorch framework in eight scenarios that differ in the target number of labels. Iterative attacks significantly outperform one-step methods, and some configurations succeed in showing all labels in the dataset. Performance depends on the dataset, model, and attack: MI-FGSM and ML-CW are often the strongest, while MLA-LP is weaker in many settings. The results establish a baseline for multi-label adversarial research and highlight practical implications for the robustness of systems that rely on multi-label predictions, motivating future work on defenses and on more effective multi-label-specific attacks.
Abstract
Deep Neural Networks (DNNs) have developed rapidly and are now applied in numerous fields. However, research indicates that DNNs are susceptible to adversarial examples, and this holds equally in the multi-label domain. To further investigate multi-label adversarial examples, we introduce a novel type of attack, termed "Showing Many Labels". The objective of this attack is to maximize the number of labels included in the classifier's prediction. In our experiments, we select nine attack algorithms and evaluate their performance under "Showing Many Labels". Eight of them are adapted from the multi-class setting to the multi-label setting, while the remaining one is designed specifically for the multi-label setting. We choose ML-LIW and ML-GCN as target models and train them on four popular multi-label datasets: VOC2007, VOC2012, NUS-WIDE, and COCO. In each of eight scenarios, we record the success rate of each algorithm at showing the expected number of labels. Experimental results indicate that, under "Showing Many Labels", iterative attacks perform significantly better than one-step attacks. Moreover, it is possible to show all labels in the dataset.
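
To make the attack objective concrete, the following is a minimal sketch of an iterative signed-gradient attack that tries to push every label into the prediction. It is an illustration only, not any of the nine algorithms evaluated in the paper and not the AdverTorch API. It assumes a toy linear multi-label classifier with weights `W` and bias `b`, a label counted as shown when its logit exceeds a threshold of 0, a softplus surrogate loss, and an L-infinity perturbation budget `eps`. The function names `predict_labels` and `show_many_labels_attack` are hypothetical.

```python
import numpy as np


def predict_labels(W, b, x, threshold=0.0):
    """Boolean vector: label j is 'shown' when its logit exceeds the threshold."""
    return (W @ x + b) > threshold


def show_many_labels_attack(W, b, x, eps=0.5, step=0.05, n_iter=50):
    """Iteratively perturb x so that as many labels as possible are shown.

    Assumed surrogate loss: sum_j softplus(-z_j), which is small when every
    logit z_j is positive. Each iteration takes a signed-gradient descent
    step on this loss and projects back into the L-infinity ball of radius
    eps around the clean input x.
    """
    x_adv = x.astype(float)  # astype returns a copy, so x is not modified
    for _ in range(n_iter):
        logits = np.clip(W @ x_adv + b, -50.0, 50.0)  # avoid overflow in exp
        # d/dz softplus(-z) = -sigmoid(-z) = -1 / (1 + exp(z))
        grad_logits = -1.0 / (1.0 + np.exp(logits))
        grad_x = W.T @ grad_logits  # chain rule through the linear model
        x_adv = x_adv - step * np.sign(grad_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the budget
    return x_adv
```

Calling the function with `n_iter=1` and `step=eps` gives an FGSM-style one-step variant under the same assumptions, which is a simple way to compare one-step and iterative behavior on a toy model. For a deep network, the gradient would come from automatic differentiation rather than the closed form used here.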
