DAHRS: Divergence-Aware Hallucination-Remediated SRL Projection

Sangpil Youm, Brodie Mather, Chathuri Jayaweera, Juliana Prada, Bonnie Dorr

TL;DR

DAHRS addresses the challenge of projecting semantic roles across languages when linguistic divergences cause alignment models to hallucinate role labels. It introduces a divergence-aware pipeline with token- and phrase-level alignment remediation followed by a First-Come First-Assign (FCFA) projection, enabling accurate, explainable cross-language SRL transfer without extra transformer-based modules. With CoNLL-2009 as ground truth, word-level F1 improves over XSRL on EN-FR (87.6% vs. 77.3%) and EN-ES (89.0% vs. 82.7%), and human phrase-level assessments yield 89.1% (EN-FR) and 91.0% (EN-ES). The approach also serves as a diagnostic tool for dataset quality (e.g., CoNLL-2009 predicate labeling), and a divergence metric supports adapting it to other language pairs (e.g., English-Tagalog). This work advances multilingual SRL by combining linguistic insight with lightweight, transparent projection.

Abstract

Semantic role labeling (SRL) enriches many downstream applications, e.g., machine translation, question answering, summarization, and stance/belief detection. However, building multilingual SRL models is challenging due to the scarcity of semantically annotated corpora for multiple languages. Moreover, state-of-the-art SRL projection (XSRL) based on large language models (LLMs) yields output that is riddled with spurious role labels. Remediation of such hallucinations is not straightforward due to the lack of explainability of LLMs. We show that hallucinated role labels are related to naturally occurring divergence types that interfere with initial alignments. We implement Divergence-Aware Hallucination-Remediated SRL projection (DAHRS), leveraging linguistically informed alignment remediation followed by greedy First-Come First-Assign (FCFA) SRL projection. DAHRS improves the accuracy of SRL projection without additional transformer-based machinery, beating XSRL in both human and automatic comparisons, and advancing beyond headwords to accommodate phrase-level SRL projection (e.g., EN-FR, EN-ES). Using CoNLL-2009 as our ground truth, we achieve a higher word-level F1 than XSRL: 87.6% vs. 77.3% (EN-FR) and 89.0% vs. 82.7% (EN-ES). Human phrase-level assessments yield 89.1% (EN-FR) and 91.0% (EN-ES). We also define a divergence metric to adapt our approach to other language pairs (e.g., English-Tagalog).
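
To make the idea of a greedy first-come first-assign projection concrete, the following is a minimal sketch of one plausible reading of the name, not the authors' algorithm. It assumes (hypothetically) that alignments arrive as an ordered list of (source index, target index) pairs that have already been remediated, that source roles are a simple index-to-label map, and that a target token keeps the first label that reaches it. The function name `fcfa_project`, the data layout, and the example sentence pair are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def fcfa_project(
    source_roles: Dict[int, str],
    alignments: List[Tuple[int, int]],
) -> Dict[int, str]:
    """Project role labels from source tokens to target tokens greedily.

    Assumption (not from the paper): alignment pairs are processed in list
    order, and a target token keeps the first label assigned to it; any later
    pair that points at an already-labeled target token is skipped.
    """
    target_roles: Dict[int, str] = {}
    for src_idx, tgt_idx in alignments:
        role = source_roles.get(src_idx)
        if role is None:
            continue  # source token carries no role, nothing to project
        if tgt_idx in target_roles:
            continue  # first come, first assigned: keep the earlier label
        target_roles[tgt_idx] = role
    return target_roles


# Hypothetical example: EN "She reads books" -> FR "Elle lit des livres".
source_tokens = ["She", "reads", "books"]
target_tokens = ["Elle", "lit", "des", "livres"]
source_roles = {0: "A0", 1: "V", 2: "A1"}

# The last pair stands in for a spurious alignment (She -> livres).
alignments = [(0, 0), (1, 1), (2, 3), (0, 3)]

projected = fcfa_project(source_roles, alignments)
for tgt_idx in sorted(projected):
    print(target_tokens[tgt_idx], projected[tgt_idx])
```

In this sketch the spurious pair is ignored only because it arrives after a correct one, which is why the ordering and quality of the alignments (the remediation step in DAHRS) would matter before any such greedy pass.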

Paper Structure

This paper contains 14 sections, 9 figures, 1 table, 3 algorithms.

Figures (9)

  • Figure 1: Divergence cases corresponding to two hallucination types: (a) Light Verbs introduce one-to-many/many-to-one divergences that impede XSRL transfer of semantic roles even when the initial alignment is correct, thus hallucinating a lack of roles on the target-language side; (b) Structural divergences introduce word/phrase order distinctions that result in extra, spuriously aligned terms, thus hallucinating incorrect roles.
  • Figure 2: Divergence-Aware Hallucination-Remediated SRL Projection (DAHRS) pipeline from English to French
  • Figure 3: Divergence-Aware Hallucination-Remediated SRL Projection (DAHRS)
  • Figure 4: Three subcategories of divergences (token level): One-to-many, Many-to-one, and Ordering
  • Figure 5: One-to-many (yellow) and Many-to-one (green) phrase-level alignments
  • ...and 4 more figures