
Deep Prompt Multi-task Network for Abuse Language Detection

Jian Zhu, Yuping Ruan, Jingfei Chang, Wenhui Sun, Hui Wan, Jian Long, Cheng Luo

TL;DR

This paper proposes a novel Deep Prompt Multi-task Network (DPMN) for abusive language detection. DPMN designs two forms of prompt tuning, deep and light, that adapt pre-trained language models (PLMs) to downstream tasks.

Abstract

Detecting abusive language remains a long-standing challenge given the extensive use of social networks, and the detection task still suffers from limited accuracy. We argue that existing detection methods rely on fine-tuning pre-trained language models (PLMs) for downstream tasks and therefore fail to stimulate the general knowledge of the PLMs. To address this problem, we propose a novel Deep Prompt Multi-task Network (DPMN) for abusive language detection. Specifically, DPMN first designs two forms of prompt tuning for the PLMs: deep prompt tuning and light prompt tuning. We study the effects of different prompt lengths, tuning strategies, and prompt initialization methods on detecting abusive language. In addition, we propose a Task Head based on Bi-LSTM and FFN, which can be used as a short-text classifier. Finally, DPMN uses multi-task learning, which transfers effective knowledge across tasks, to further improve detection metrics. We evaluate DPMN against eight typical methods on three public datasets: OLID, SOLID, and AbuseAnalyzer. The experimental results show that DPMN outperforms the state-of-the-art methods.
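
The abstract does not spell out how the per-task losses are combined, so the following is only a minimal sketch of one common multi-task objective: a weighted sum of per-task cross-entropy losses. The task names, logits, and weights are hypothetical placeholders, not values from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the gold label under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def multitask_loss(task_logits, task_labels, task_weights):
    """Weighted sum of per-task cross-entropy losses.

    The weighting scheme is an assumption for illustration only.
    """
    return sum(
        task_weights[name] * cross_entropy(task_logits[name], task_labels[name])
        for name in task_logits
    )


# Hypothetical example: one short text scored by two task heads.
logits = {"abuse": [0.2, 1.5], "auxiliary": [0.0, 0.0, 0.0]}
labels = {"abuse": 1, "auxiliary": 2}
weights = {"abuse": 1.0, "auxiliary": 0.5}
total = multitask_loss(logits, labels, weights)
```

Because every task head reads the same shared embedding, gradients from the summed loss would update the shared parameters with signal from all tasks, which is one way the knowledge transfer described above could arise.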

Paper Structure (20 sections, 5 equations, 5 figures, 3 tables)

This paper contains 20 sections, 5 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: An example of applying prompt-based learning to identify whether a tweet is abusive language. The mask token stands for a word to be predicted. The verbalizer offers two word choices for it: abuse and non-abuse.
  • Figure 2: Our DPMN architecture. The prompt encoder module generates the learnable embedding according to the number of continuous prompt tokens, the initialization strategy of the continuous prompt, the prompt form, and the tuning strategy. The tokenizer encoder module encodes the short text to generate the input embedding. Combining the learnable embedding and the input embedding produces a representation matrix, which is fed into the PLM. The PLM outputs the shared embedding, which is passed to the task heads. The task heads output the predicted class probabilities. The total loss function is computed to train the entire DPMN architecture.
  • Figure 3: We design two continuous prompt forms: deep prompt tuning and light prompt tuning. Deep prompt tuning adds trainable continuous prompt embeddings to every layer of the PLM. Light prompt tuning adds trainable continuous prompt embeddings to the first layer of the PLM only.
  • Figure 4: Ablation experiments on the OLID dataset determine the Macro F1 contribution of each neural network module of DPMN. A histogram shows the improvement (lifting value) contributed by each structural component: the Bi-LSTM + FFN module, the multi-task learning (MTL) module, and the prompt-based learning module.
  • Figure 5: The convergence of the DPMN is verified on the OLID dataset.
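
The difference between the two prompt forms in Figure 3 can be illustrated with a toy sketch. It assumes that deep prompt tuning overwrites the prompt positions with a fresh trainable prompt at every layer, while light prompt tuning prepends a prompt only to the first layer's input. The `toy_layer` function is a hypothetical stand-in for a frozen PLM layer, and all sizes are arbitrary; none of this is taken from the paper's implementation.

```python
import random


def make_prompt(prompt_len, hidden, rng):
    """One trainable prompt: prompt_len vectors of size hidden, randomly initialized."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(prompt_len)]


def toy_layer(sequence):
    """Stand-in for a frozen PLM layer: adds the sequence mean to every vector."""
    n = len(sequence)
    mean = [sum(col) / n for col in zip(*sequence)]
    return [[x + m for x, m in zip(vec, mean)] for vec in sequence]


def forward(input_emb, prompts, num_layers, deep):
    """Run the toy stack with light (first layer only) or deep (every layer) prompts."""
    prompt_len = len(prompts[0])
    # Both forms prepend a prompt to the input embedding before the first layer.
    hidden_states = prompts[0] + input_emb
    for layer in range(num_layers):
        if deep and layer > 0:
            # Deep form: replace the prompt positions with this layer's own prompt.
            hidden_states = prompts[layer] + hidden_states[prompt_len:]
        hidden_states = toy_layer(hidden_states)
    return hidden_states


def num_prompt_params(prompts):
    """Count the trainable prompt parameters."""
    return sum(len(p) * len(p[0]) for p in prompts)


rng = random.Random(0)
num_layers, prompt_len, hidden = 4, 3, 8
input_emb = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(5)]

light_prompts = [make_prompt(prompt_len, hidden, rng)]
deep_prompts = [make_prompt(prompt_len, hidden, rng) for _ in range(num_layers)]

light_out = forward(input_emb, light_prompts, num_layers, deep=False)
deep_out = forward(input_emb, deep_prompts, num_layers, deep=True)
```

In this sketch the light form trains `prompt_len * hidden` prompt parameters and the deep form trains `num_layers * prompt_len * hidden`, so the deep form trades a larger trainable footprint for the ability to steer every layer directly.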