KeNet: Knowledge-enhanced Doc-Label Attention Network for Multi-label Text Classification

Bo Li, Yuyan Chen, Liang Zeng

TL;DR

The Knowledge-enhanced Doc-Label Attention Network (KeNet) is an attention network that combines external knowledge, label embedding, and a comprehensive attention mechanism to predict all labels for each text.

Abstract

Multi-Label Text Classification (MLTC) is a fundamental task in Natural Language Processing (NLP) that assigns multiple labels to a given text. MLTC has been widely applied in domains such as topic recognition, recommendation systems, sentiment analysis, and information retrieval. However, traditional machine learning and deep neural network methods have not yet addressed certain issues, such as documents that are brief but carry a large number of labels, and how to establish relationships between the labels. The importance of external knowledge in MLTC has also been substantiated. To address these issues, we propose a novel approach, the Knowledge-enhanced Doc-Label Attention Network (KeNet). Specifically, we design an attention network that incorporates external knowledge, label embedding, and a comprehensive attention mechanism. In contrast to conventional methods, we use a comprehensive representation of documents, knowledge, and labels to predict all labels for each text. We validate our approach with comprehensive experiments on three multi-label datasets. Experimental results demonstrate that our method outperforms state-of-the-art MLTC methods. Additionally, a case study illustrates the practical application of KeNet.
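
To make the idea of attending from labels to a document concrete, here is a minimal sketch of generic label-wise attention followed by independent per-label thresholding. It is not the KeNet architecture, whose details are not given in this summary. The function names, the dot-product scoring, the 0.5 threshold, and the toy vectors are all assumptions chosen for illustration. One could imagine appending external-knowledge token vectors to `token_vecs`, but that too is an assumption.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def label_attention(token_vecs, label_vecs):
    """For each label embedding, attend over the token vectors.

    Returns (weights, doc_vecs): one attention distribution and one
    label-specific document vector per label.
    """
    dim = len(token_vecs[0])
    all_weights, doc_vecs = [], []
    for label in label_vecs:
        weights = softmax([dot(label, tok) for tok in token_vecs])
        doc_vec = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
                   for d in range(dim)]
        all_weights.append(weights)
        doc_vecs.append(doc_vec)
    return all_weights, doc_vecs

def predict_labels(token_vecs, label_vecs, threshold=0.5):
    """Score every label independently (sigmoid), so several can be selected."""
    _, doc_vecs = label_attention(token_vecs, label_vecs)
    probs = [1.0 / (1.0 + math.exp(-dot(label, doc)))
             for label, doc in zip(label_vecs, doc_vecs)]
    selected = [i for i, p in enumerate(probs) if p >= threshold]
    return selected, probs

# Toy inputs (hypothetical): three 2-d token vectors, two label embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [[2.0, 0.0], [0.0, -2.0]]
selected, probs = predict_labels(tokens, labels)
print(selected, probs)
```

Because each label gets its own sigmoid score rather than competing in a single softmax, any number of labels can pass the threshold, which is what distinguishes multi-label from single-label classification.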

Paper Structure (15 sections, 7 equations, 5 figures, 3 tables)

This paper contains 15 sections, 7 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: An example of Multi-label Text Classification.
  • Figure 2: The architecture of the proposed KeNet model.
  • Figure 3: Influence of the hidden-state dimension and of document and knowledge length on the RCV1-V2 dataset.
  • Figure 4: Visual analysis of KeNet on an MLTC task with label $cs.sy$ (a) and $math.oc$ (b).
  • Figure 5: Weights of all labels of a given document.