Label Dependencies-aware Set Prediction Networks for Multi-label Text Classification
Du Xinkai, Han Quanjie, Sun Yalin, Lv Chao, Sun Maosong
TL;DR
This paper addresses multi-label text classification (MLTC) by treating label prediction as an unordered set prediction task. It introduces Label Dependencies-aware Set Prediction Networks (LD-SPN), which combine a BERT-based set predictor with a graph convolutional network that models label dependencies derived from label co-occurrence, and add a Bhattacharyya-distance-based regularizer that diversifies the output distributions to boost recall. A bipartite matching loss aligns predictions with the ground truth in a permutation-invariant manner, and the non-autoregressive decoder generates all labels in parallel for efficiency. Experiments on MixSNIPS and AAPD show higher F1 scores and improved recall than the baselines, and ablations confirm the importance of both label dependencies and output diversity. Overall, LD-SPN offers an efficient way to exploit label correlations in MLTC, with gains in both precision and recall.
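To make the permutation-invariant alignment concrete, the following is a minimal sketch of a bipartite matching loss, not the authors' implementation. It assumes each prediction slot outputs a probability distribution over labels plus a hypothetical padding label `NONE` for unused slots, and that the cost of an assignment is the total negative log-likelihood. For readability it finds the best assignment by brute force over permutations; a practical system would more likely use the Hungarian algorithm.

```python
import itertools
import math

NONE = "<none>"  # hypothetical padding label for unused prediction slots


def matching_loss(slot_probs, gold_labels):
    """Return (loss, assignment) for the cheapest slot-to-label matching.

    slot_probs: list of dicts mapping label -> probability, one per slot.
    gold_labels: the ground-truth labels, in any order.
    """
    n = len(slot_probs)
    if len(gold_labels) > n:
        raise ValueError("more gold labels than prediction slots")
    # Pad the gold set so every slot is matched to exactly one target.
    targets = list(gold_labels) + [NONE] * (n - len(gold_labels))

    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(n)):
        # perm[i] is the slot assigned to targets[i].
        cost = 0.0
        for i, target in enumerate(targets):
            p = slot_probs[perm[i]].get(target, 0.0)
            cost += -math.log(max(p, 1e-12))  # clamp to avoid log(0)
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


slots = [
    {"a": 0.1, "b": 0.8, NONE: 0.1},
    {"a": 0.7, "b": 0.2, NONE: 0.1},
    {"a": 0.1, "b": 0.1, NONE: 0.8},
]
loss_ab, perm_ab = matching_loss(slots, ["a", "b"])
loss_ba, perm_ba = matching_loss(slots, ["b", "a"])
```

Because the minimum is taken over all assignments, listing the gold labels as `["a", "b"]` or `["b", "a"]` yields the same loss, which is the property that removes any dependence on label order.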
Abstract
Multi-label text classification involves extracting all relevant labels from a sentence. Because these labels are unordered, we propose approaching the problem as a set prediction task. To capture the correlations between labels, we use Graph Convolutional Networks with an adjacency matrix constructed from the statistical relations between labels. Additionally, we improve recall by applying the Bhattacharyya distance to the output distributions of the set prediction networks. We evaluate our approach on two multi-label datasets and show experimentally that it outperforms previous baselines.
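The two statistical ingredients named in the abstract can be illustrated with a short sketch. The exact construction used in the paper is not given in this section, so two assumptions are made here: the adjacency entry for labels i and j is the conditional co-occurrence frequency P(j | i) estimated from training label sets, and the Bhattacharyya distance is computed in its standard discrete form, D_B(p, q) = -ln(sum_k sqrt(p_k q_k)). The function names are illustrative only.

```python
import math


def cooccurrence_adjacency(label_sets, labels):
    """Estimate A[i][j] = P(label j present | label i present) from label sets."""
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    counts = [[0] * n for _ in range(n)]
    for label_set in label_sets:
        ids = sorted({index[label] for label in label_set})
        for i in ids:
            for j in ids:
                counts[i][j] += 1  # the diagonal counts occurrences of label i
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if counts[i][i] == 0:
            continue  # label never observed: leave its row at zero
        for j in range(n):
            adjacency[i][j] = counts[i][j] / counts[i][i]
    return adjacency


def bhattacharyya_distance(p, q):
    """Standard Bhattacharyya distance between two discrete distributions."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(max(coefficient, 1e-12))  # clamp to avoid log(0)


train_label_sets = [{"a", "b"}, {"a"}, {"b", "c"}]
A = cooccurrence_adjacency(train_label_sets, ["a", "b", "c"])
near = bhattacharyya_distance([0.9, 0.1], [0.8, 0.2])
far = bhattacharyya_distance([0.9, 0.1], [0.1, 0.9])
```

Under this reading, a larger distance between two output distributions means the corresponding prediction slots favor different labels, which is the kind of diversity the regularizer is described as encouraging; how the distance enters the training objective is not specified in this section.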
