Improving Zero-shot Sentence Decontextualisation with Content Selection and Planning

Zhenyun Deng, Yulong Chen, Andreas Vlachos

TL;DR

This work addresses sentence decontextualisation: rewriting a sentence so that it keeps its meaning when read outside its original context. It introduces ECSP, an EDU-level content selection and planning framework that first segments a sentence and its context into Elementary Discourse Units (EDUs), identifies ambiguous EDUs through discourse-relational analysis, and then selects relevant EDUs to create a structured content plan. The plan guides a rewrite that enriches each ambiguous EDU with its discourse-relevant content, yielding decontextualised sentences with improved semantic integrity and discourse coherence. On a standard benchmark and on downstream tasks, ECSP achieves strong zero-shot performance, offers interpretable intermediate outputs (the ambiguous EDUs and the selected relevant EDUs, RelEDUs), and improves multi-hop evidence retrieval and claim extraction, suggesting practical benefits for evidence-based reasoning systems.

Abstract

Extracting individual sentences from a document as evidence or reasoning steps is common in many NLP tasks. However, extracted sentences often lack the context necessary to be understood, e.g., coreference and background information. To this end, we propose a content selection and planning framework for zero-shot decontextualisation, which determines what content should be mentioned and in what order for a sentence to be understood out of context. Specifically, given a potentially ambiguous sentence and its context, we first segment both into basic semantically-independent units. We then identify potentially ambiguous units in the given sentence, and extract relevant units from the context based on their discourse relations. Finally, we generate a content plan to rewrite the sentence by enriching each ambiguous unit with its relevant units. Experimental results demonstrate that our approach is competitive for sentence decontextualisation and produces sentences with better semantic integrity and discourse coherence than existing methods.
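
The data flow described in the abstract (segment, identify ambiguous units, select relevant units, build a plan) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `segment`, `is_ambiguous`, `is_relevant`, `build_plan` and `render_plan` are hypothetical, and the punctuation, pronoun and proper-noun heuristics are toy stand-ins for the segmenter, ambiguity detection and discourse-relation analysis, which this summary does not specify.

```python
import re
from dataclasses import dataclass, field

# Toy stand-in for ambiguity detection: a unit with one of these words is
# treated as ambiguous. This is an illustrative assumption, not the paper's method.
PRONOUNS = {"he", "she", "it", "they", "his", "her", "its", "their", "this", "that"}


@dataclass
class PlanStep:
    """One step of a content plan: an ambiguous unit and the units that enrich it."""
    ambiguous_edu: str
    relevant_edus: list = field(default_factory=list)


def words(text):
    """Return the alphabetic tokens of a text span."""
    return re.findall(r"[A-Za-z]+", text)


def segment(text):
    """Hypothetical segmenter: split on commas, semicolons and full stops."""
    return [part.strip() for part in re.split(r"[,;.]", text) if part.strip()]


def is_ambiguous(edu):
    """Hypothetical ambiguity test: the unit contains a pronoun."""
    return any(tok.lower() in PRONOUNS for tok in words(edu))


def is_relevant(context_edu):
    """Hypothetical relevance test standing in for discourse relations:
    the context unit is not itself ambiguous and names an entity
    (a capitalised token after the first word)."""
    toks = words(context_edu)
    return not is_ambiguous(context_edu) and any(t[0].isupper() for t in toks[1:])


def build_plan(sentence, context):
    """Pair each ambiguous unit of the sentence with relevant context units, in order."""
    context_edus = segment(context)
    plan = []
    for edu in segment(sentence):
        if is_ambiguous(edu):
            relevant = [c for c in context_edus if is_relevant(c)]
            plan.append(PlanStep(edu, relevant))
    return plan


def render_plan(plan):
    """Render the plan as numbered instructions that a rewriting step could follow."""
    lines = []
    for i, step in enumerate(plan, 1):
        extra = "; ".join(step.relevant_edus) or "(nothing found)"
        lines.append(f"{i}. Enrich [{step.ambiguous_edu}] with: {extra}")
    return "\n".join(lines)


context = "Marie Curie was a physicist. She moved to Paris in 1891."
sentence = "She won the Nobel Prize in 1903."
print(render_plan(build_plan(sentence, context)))
```

The sketch stops at the plan. The final rewrite, which turns the plan into a fluent standalone sentence, is left out because this summary does not describe how it is performed.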

Paper Structure

This paper contains 43 sections, 4 equations, 3 figures, 17 tables.

Figures (3)

  • Figure 1: An overview of our proposed EDU-level content selection and planning (ECSP) framework for decontextualisation. The sentence to decontextualise is highlighted in bold. ECSP consists of two modules: (i) Content selection: identifies ambiguous EDUs in the sentence and selects EDUs that have discourse relations with the sentence as context required for decontextualisation; (ii) Content planning: rewrites the sentence to be understood out of context by sequentially enriching each ambiguous EDU with its discourse-relevant EDUs.
  • Figure 2: Case studies of our EDU decontextualisation. The underlined sentences are the ones to be decontextualised. The text spans (i.e., EDUs) in gray are ambiguous EDUs, and those in orange are relevant contextual EDUs.
  • Figure :