Label Dependencies-aware Set Prediction Networks for Multi-label Text Classification

Du Xinkai, Han Quanjie, Sun Yalin, Lv Chao, Sun Maosong

TL;DR

This paper addresses multi-label text classification (MLTC) by treating the labels of a text as an unordered set to be predicted. It introduces Label Dependencies-aware Set Prediction Networks (LD-SPN), which combine a BERT-based set predictor with a graph convolutional network (GCN) that models label dependencies derived from label co-occurrence, and add a Bhattacharyya-distance-based regularizer that diversifies the output distributions to boost recall. A bipartite matching loss aligns predictions with the ground truth in a permutation-invariant manner, and the non-autoregressive decoder generates all labels in parallel for efficiency. Experiments on MixSNIPS and AAPD show higher F1 scores and improved recall than previous baselines, and ablations confirm the contribution of both label dependencies and output diversity. Overall, LD-SPN offers a principled, efficient way to exploit label correlations in MLTC.
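
To make the permutation-invariant matching step concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function name `matching_loss`, the padding of the target set with a "null" label, and the negative log-likelihood cost are all assumed here, and the brute-force search over permutations stands in for the Hungarian algorithm that set prediction models typically use.

```python
import math
from itertools import permutations


def matching_loss(pred_probs, gold_labels, null_label, eps=1e-12):
    """Return (loss, assignment) for the cheapest slot-to-target matching.

    pred_probs  : list of N probability distributions, one per decoder slot
    gold_labels : list of at most N gold label indices (order is irrelevant)
    null_label  : index of the assumed "no label" class used for padding
    """
    n = len(pred_probs)
    # Pad the gold set with the null label so it has one target per slot.
    targets = list(gold_labels) + [null_label] * (n - len(gold_labels))

    best_loss, best_perm = math.inf, None
    # Brute force over all assignments; fine for a handful of slots only.
    for perm in permutations(range(n)):
        # perm[i] is the slot assigned to target i.
        loss = -sum(
            math.log(max(pred_probs[perm[i]][targets[i]], eps))
            for i in range(n)
        )
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two slots, three classes (class 2 plays the role of the null label).
probs = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]
loss_a, perm_a = matching_loss(probs, [0, 1], null_label=2)
loss_b, perm_b = matching_loss(probs, [1, 0], null_label=2)
print(perm_a, perm_b, loss_a == loss_b)
```

Reordering the gold labels changes which slot is matched to which target but not the loss value, which is the sense in which the objective is permutation-invariant.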

Abstract

Multi-label text classification involves extracting all relevant labels from a sentence. Given the unordered nature of these labels, we propose approaching the problem as a set prediction task. To address the correlation between labels, we leverage Graph Convolutional Networks and construct an adjacency matrix based on the statistical relations between labels. Additionally, we enhance recall ability by applying the Bhattacharyya distance to the output distributions of the set prediction networks. We evaluate the effectiveness of our approach on two multi-label datasets and demonstrate its superiority over previous baselines through experimental results.
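
The abstract does not spell out which "statistical relations" populate the adjacency matrix. One common choice, assumed here purely for illustration, is the conditional co-occurrence probability P(label j | label i) estimated from the training label sets. The function name `cooccurrence_adjacency` and the self-loops on the diagonal are likewise assumptions of this sketch.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_adjacency(label_sets, num_labels):
    """Build A[i][j] = P(label j present | label i present) from training data."""
    label_count = Counter()  # number of samples containing each label
    pair_count = Counter()   # number of samples containing both labels

    for labels in label_sets:
        unique = sorted(set(labels))
        label_count.update(unique)
        for i, j in combinations(unique, 2):
            pair_count[(i, j)] += 1
            pair_count[(j, i)] += 1

    adj = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        adj[i][i] = 1.0  # self-loop (assumed) so each node keeps its own feature
        for j in range(num_labels):
            if i != j and label_count[i] > 0:
                adj[i][j] = pair_count[(i, j)] / label_count[i]
    return adj


# Toy training set with three labels.
train_label_sets = [[0, 1], [0, 1], [0, 2], [1]]
for row in cooccurrence_adjacency(train_label_sets, num_labels=3):
    print([round(v, 3) for v in row])
```

Note that this matrix is asymmetric: label 2 always appears with label 0 in the toy data, while label 0 appears with label 2 only a third of the time.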

Paper Structure

This paper contains 13 sections, 11 equations, 1 figure, and 4 tables.

Figures (1)

  • Figure 1: Overall architecture of our LD-SPN model. The set prediction networks predict all labels simultaneously by combining a BERT encoder for sentence representation and the label dependencies learned by the GCN with a non-autoregressive decoder. The Bhattacharyya distance is imposed on the output distributions of the set prediction networks, and the bipartite matching loss between the ground truth and the predictions is optimized to obtain the predicted labels.
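
As a rough illustration of the Bhattacharyya term mentioned in the caption, the sketch below computes the distance between two discrete distributions, D_B(p, q) = -ln Σ_k sqrt(p_k · q_k), and averages it over all pairs of decoder-slot outputs. The averaging over pairs, the clamping constant `eps`, and both function names are assumptions of this sketch; the summary does not state how the paper aggregates or weights the term.

```python
import math
from itertools import combinations


def bhattacharyya_distance(p, q, eps=1e-12):
    """D_B(p, q) = -ln(sum_k sqrt(p_k * q_k)) for discrete distributions."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to avoid log(0) when the distributions do not overlap at all.
    return -math.log(max(coefficient, eps))


def mean_pairwise_distance(distributions):
    """Average Bhattacharyya distance over all pairs of slot outputs."""
    pairs = list(combinations(distributions, 2))
    return sum(bhattacharyya_distance(p, q) for p, q in pairs) / len(pairs)


# Three slot outputs over two classes: two identical, one different.
slots = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
print(round(mean_pairwise_distance(slots), 4))
```

Identical distributions have distance zero, so a larger average indicates more diverse slot outputs. A training objective could reward that diversity, for example by subtracting a weighted version of this average from the matching loss, though that particular combination is an assumption rather than something the caption specifies.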