Multi-Label Classification for Implicit Discourse Relation Recognition

Wanqiu Long, N. Siddharth, Bonnie Webber

TL;DR

This work reframes implicit discourse relation recognition (IDRR) as a multi-label classification problem using PDTB-3, arguing that single-label framing misses interdependent discourse senses. It compares three modeling paradigms—a RoBERTa-based single-head classifier, per-label binary heads, and an encoder-decoder sequence model—under 12-fold section-level cross-validation, showing multi-label methods can outperform traditional single-label systems even when evaluated with single-label metrics. Method 2 (per-label binary heads) generally yields the strongest macro-F1, while labels such as 'Asynchronous' and 'Equivalence' remain challenging; the authors also find that focal loss can improve performance on imbalanced labels and that example-level CV can stabilize results. They provide a fine-grained analysis of label correlations and mispredictions, discuss dataset and methodological limitations, and propose avenues for richer multi-label discourse datasets and more advanced multi-label architectures. Overall, the study supports broader adoption and further development of multi-label approaches for IDRR with practical implications for downstream NLP tasks.

Abstract

Discourse relations play a pivotal role in establishing coherence within textual content, uniting sentences and clauses into a cohesive narrative. The Penn Discourse Treebank (PDTB) is one of the most widely used datasets in this domain. In PDTB-3, annotators can assign multiple labels to an example when they believe that multiple relations are present. Prior research in discourse relation recognition has treated these instances as separate examples during training, and only one example needs to have its label predicted correctly for the instance to be judged as correct. However, this approach is inadequate, as it fails to account for the interdependence of labels in real-world contexts and to distinguish between cases where only one sense relation holds and cases where multiple relations hold simultaneously. In our work, we address this challenge by exploring various multi-label classification frameworks for implicit discourse relation recognition. We show that multi-label classification methods do not degrade performance on single-label prediction. Additionally, we give a comprehensive analysis of results and data. Our work contributes to advancing the understanding and application of discourse relations and provides a foundation for future study.
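
The difference between the two evaluation conventions described in the abstract can be illustrated with a minimal sketch. This is an illustration only, not the authors' evaluation code: the function names, the example label set, and the choice of exact match and example-level F1 as multi-label scores are all assumptions made here for clarity.

```python
from typing import Set


def lenient_correct(gold: Set[str], predicted: str) -> bool:
    """Single-label convention: one predicted sense counts as correct
    if it matches ANY of the gold senses."""
    return predicted in gold


def exact_match(gold: Set[str], predicted: Set[str]) -> bool:
    """Multi-label convention (assumed here): the predicted set must
    equal the gold set."""
    return gold == predicted


def example_f1(gold: Set[str], predicted: Set[str]) -> float:
    """Example-level F1 between the gold and predicted label sets."""
    if not gold and not predicted:
        return 1.0
    overlap = len(gold & predicted)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical instance annotated with two senses (labels are illustrative).
gold = {"Cause", "Asynchronous"}

print(lenient_correct(gold, "Cause"))  # single-label view of the prediction
print(exact_match(gold, {"Cause"}))    # multi-label view of the same prediction
print(example_f1(gold, {"Cause"}))     # partial credit for one of two senses
```

Under the lenient convention, a model that predicts only one of the two annotated senses is scored as fully correct, so the score cannot separate instances where one relation holds from instances where several hold. A set-based score makes the missing label visible.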

Paper Structure (27 sections, 3 equations, 3 figures, 12 tables)

Figures (3)

  • Figure 1: Co-occurrence of label pairs in the dataset and in the prediction. The upper sub-figure is for the gold label pairs, while the lower is for the predicted pairs.
  • Figure 2: Heatmap of underpredicted multi-label instances. This figure displays the distribution of instances where two labels are annotated but only one is predicted.
  • Figure 3: Heatmap of overpredicted single-label instances. This figure displays the distribution of instances where a single label is annotated but the model predicts two labels.