Fine-Grained Evaluation for Implicit Discourse Relation Recognition

Xinyi Cai

TL;DR

Implicit discourse relation recognition remains challenging due to the absence of explicit connectives. This paper provides a fine-grained diagnostic of state-of-the-art pre-trained language models on PDTB 3.0 level-2 senses, augmented by a semi-manual data-annotation pipeline to bolster scarce categories. It reveals data distribution effects, identifies intrinsically difficult senses, and analyzes linguistic cues and cross-level inconsistencies between level-1 and level-2 predictions. The findings support targeted data augmentation and hierarchical modeling as promising directions to advance fine-grained implicit relation recognition.

Abstract

Implicit discourse relation recognition is a challenging task in discourse analysis due to the absence of explicit discourse connectives between spans of text. Recent pre-trained language models have achieved great success on this task. However, there has been no fine-grained analysis of how these models perform on it, so the sources of difficulty and the possible directions for the task remain unclear. In this paper, we analyze the model predictions in depth, attempting to identify what is difficult for pre-trained language models and where the task could go next. In addition to this analysis, we semi-manually annotate relatively high-quality data for the relations that have few annotated examples in PDTB 3.0. The annotated data significantly help improve implicit discourse relation recognition for level-2 senses.

Paper Structure

This paper contains 12 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Confusion matrix for level-2 senses on PDTB 3.0.
  • Figure 2: Inconsistent model predictions across sense levels. The colors, from left to right, indicate predictions that are: correct for level-1 senses but wrong for level-2 senses; correct for level-2 senses but wrong for level-1 senses; correct for both level-1 and level-2 senses; wrong for both level-1 and level-2 senses.
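
The four categories in the Figure 2 caption can be illustrated with a short sketch. This is not the authors' code: the function names, the bucket names, and the toy predictions below are hypothetical, and the sketch assumes that senses are written as "Level1.Level2" strings and that level-1 and level-2 predictions come from separate classifiers.

```python
from collections import Counter


def level1_of(sense: str) -> str:
    """Return the level-1 class of a 'Level1.Level2' sense string (assumed format)."""
    return sense.split(".", 1)[0]


def consistency_breakdown(gold_l2, pred_l1, pred_l2):
    """Count examples in each of the four cross-level categories.

    gold_l2: gold level-2 senses, e.g. 'Contingency.Cause'
    pred_l1: predicted level-1 senses, e.g. 'Contingency'
    pred_l2: predicted level-2 senses, e.g. 'Contingency.Cause'
    """
    counts = Counter()
    for gold, p1, p2 in zip(gold_l2, pred_l1, pred_l2):
        l1_ok = p1 == level1_of(gold)
        l2_ok = p2 == gold
        if l1_ok and l2_ok:
            counts["both_correct"] += 1
        elif l1_ok:
            counts["l1_only"] += 1      # level-1 right, level-2 wrong
        elif l2_ok:
            counts["l2_only"] += 1      # level-2 right, level-1 wrong
        else:
            counts["both_wrong"] += 1
    return counts


# Toy data for illustration only; these are not results from the paper.
gold = ["Contingency.Cause", "Expansion.Conjunction",
        "Comparison.Concession", "Temporal.Asynchronous"]
pred_level1 = ["Contingency", "Expansion", "Expansion", "Comparison"]
pred_level2 = ["Contingency.Cause", "Expansion.Instantiation",
               "Comparison.Concession", "Expansion.Conjunction"]

print(consistency_breakdown(gold, pred_level1, pred_level2))
```

The "l2_only" bucket can only be non-empty when the two levels are predicted independently; if level-1 labels were derived from the level-2 prediction, a correct level-2 sense would always imply a correct level-1 sense.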