Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
Paul Roit, Aviv Slobodkin, Eran Hirsch, Arie Cattan, Ayal Klein, Valentina Pyatkin, Ido Dagan
TL;DR
This work reframes document-level semantic argument detection as a textual-entailment task, enabling cross-sentence arguments to be identified without extensive domain-specific supervision. It constructs simple semantic hypotheses from a predicate’s in-sentence arguments and candidate cross-sentence phrases, then tests each hypothesis for entailment against the full document to select valid arguments. A predicate-argument-aware NLI model trained on QA-SRL data performs strongly, often surpassing task-specific supervised baselines on document-level benchmarks, while experiments with LLM prompting help characterize the limits of zero-shot approaches. The method yields schema-free propositions that are easy to process downstream, that make cross-sentence semantics explicit, and that can augment SRL and event extraction pipelines, albeit at a computational cost and with a dependence on robust entailment models.
Abstract
Detecting semantic arguments of a predicate word has been conventionally modeled as a sentence-level task. The typical reader, however, perfectly interprets predicate-argument relations in a much wider context than just the sentence where the predicate was evoked. In this work, we reformulate the problem of argument detection through textual entailment to capture semantic relations across sentence boundaries. We propose a method that tests whether some semantic relation can be inferred from a full passage by first encoding it into a simple and standalone proposition and then testing for entailment against the passage. Our method does not require direct supervision, which is generally absent due to dataset scarcity, but instead builds on existing NLI and sentence-level SRL resources. Such a method can potentially explicate pragmatically understood relations into a set of explicit sentences. We demonstrate it on a recent document-level benchmark, outperforming some supervised methods and contemporary language models.
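To make the proposition-then-entailment idea concrete, the following Python sketch shows one way the loop could be organized. It is an illustration under stated assumptions, not the authors' implementation: the function names, the string template, the threshold, and in particular `toy_entailment_score` (a lexical-overlap placeholder standing in for a real NLI model) are all hypothetical.

```python
import re
from typing import Callable, List, Tuple

# Type of a scorer: (premise, hypothesis) -> score in [0, 1].
EntailmentScorer = Callable[[str, str], float]


def build_hypothesis(template: str, candidate: str) -> str:
    """Fill a candidate phrase into a standalone proposition.

    The template is assumed to already contain the predicate and its
    in-sentence arguments, e.g. "The shipment arrived {candidate}.".
    """
    return template.format(candidate=candidate)


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """PLACEHOLDER scorer: fraction of hypothesis words found in the premise.

    This is NOT textual entailment. It only keeps the sketch runnable
    without a model; a real system would plug in an NLI classifier here.
    """
    premise_words = set(re.findall(r"[a-z]+", premise.lower()))
    hypothesis_words = set(re.findall(r"[a-z]+", hypothesis.lower()))
    if not hypothesis_words:
        return 0.0
    return len(hypothesis_words & premise_words) / len(hypothesis_words)


def detect_arguments(
    passage: str,
    template: str,
    candidates: List[str],
    scorer: EntailmentScorer = toy_entailment_score,
    threshold: float = 0.9,  # assumed cut-off, chosen for illustration only
) -> List[Tuple[str, float]]:
    """Keep the candidates whose proposition the passage is judged to entail."""
    accepted = []
    for candidate in candidates:
        hypothesis = build_hypothesis(template, candidate)
        score = scorer(passage, hypothesis)
        if score >= threshold:
            accepted.append((candidate, score))
    return accepted


# The location of "arrived" appears only in the preceding sentence.
passage = (
    "The freighter sailed to Rotterdam in heavy fog. "
    "The shipment arrived two days late."
)
template = "The shipment arrived {candidate}."
print(detect_arguments(passage, template, ["in Rotterdam", "in Lisbon"]))
```

Because the scorer is passed in as a parameter, the overlap heuristic can be replaced by any function that maps a (passage, hypothesis) pair to an entailment score, leaving the hypothesis construction and selection logic unchanged.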
