Fine-Grained Evaluation for Implicit Discourse Relation Recognition
Xinyi Cai
TL;DR
Implicit discourse relation recognition remains challenging due to the absence of explicit connectives. This paper provides a fine-grained diagnostic of state-of-the-art pre-trained language models on PDTB 3.0 level-2 senses, augmented by a semi-manual data-annotation pipeline to bolster scarce categories. It reveals data distribution effects, identifies intrinsically difficult senses, and analyzes linguistic cues and cross-level inconsistencies between level-1 and level-2 predictions. The findings support targeted data augmentation and hierarchical modeling as promising directions to advance fine-grained implicit relation recognition.
Abstract
Implicit discourse relation recognition is a challenging task in discourse analysis because no explicit discourse connective links the text spans. Recent pre-trained language models have achieved great success on this task. However, there has been no fine-grained analysis of how these models perform on it, so the sources of difficulty and the promising directions for the task remain unclear. In this paper, we analyze model predictions in depth to identify what pre-trained language models find difficult and where future work might go. In addition to this analysis, we semi-manually annotate data to add relatively high-quality examples for the relations that have few annotated examples in PDTB 3.0. The annotated data significantly improve implicit discourse relation recognition for level-2 senses.
