Where did you get that? Towards Summarization Attribution for Analysts

Violet B, John M. Conroy, Sean Lynch, Danielle M, Neil P. Molino, Aaron Wiechmann, Julia S. Yang

TL;DR

This work tackles attribution in analyst-focused automatic summaries by linking each summary sentence to supporting source passages and evaluating a hybrid summarization pipeline (OCCAMS extractive plus GPT paraphrase) against a purely abstractive GPT approach across the CrisisFACTS, Cyber Threat Intelligence, and TAC 2011 datasets. It systematically compares two attribution methods (NLI versus sentence embeddings) using human judgments and task-based evaluation (Task 1 and Task 2), finding that embedding-based attribution generally aligns better with humans and that the hybrid pipeline often makes attribution easier, albeit with dataset-dependent refutation patterns. The study introduces a refutation typology to categorize factual errors and demonstrates that parsing, time-shift, and related-information issues influence attribution quality, with practical implications for trustworthy analyst-ready summaries. Overall, the results highlight the value of a hybrid extraction-plus-paraphrase approach and targeted attribution strategies for reducing hallucinations and improving traceability of automated summaries.

Abstract

Analysts require attribution, as nothing can be reported without knowing the source of the information. In this paper, we focus on automatic methods for attribution, linking each sentence in the summary to a portion of the source text, which may be in one or more documents. We explore using hybrid summarization, i.e., an automatic paraphrase of an extractive summary, to ease attribution. We also use a custom typology to identify the proportion of different categories of attribution-related errors.
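
To make the attribution task concrete, the following is a minimal sketch of similarity-based attribution: each summary sentence is linked to the most similar source sentence, and the link is dropped when the best score is too low. It is an illustration under stated assumptions, not the authors' implementation. The bag-of-words `embed` function is a stand-in for a real sentence-embedding model, and the names `embed`, `cosine`, `attribute`, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Stand-in embedding (assumption): lowercase bag-of-words counts.
    A real system would use a neural sentence-embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(summary_sentences, source_sentences, threshold=0.5):
    """Link each summary sentence to its best-matching source sentence.

    source_sentences is a list of (doc_id, sentence) pairs, so a link can
    point into any of several documents. threshold is a hypothetical
    cutoff: below it, the sentence is reported as unattributed (None).
    """
    source_vecs = [embed(text) for _, text in source_sentences]
    links = []
    for sent in summary_sentences:
        vec = embed(sent)
        scores = [cosine(vec, sv) for sv in source_vecs]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] < threshold:
            links.append((sent, None))
        else:
            links.append((sent, source_sentences[best]))
    return links


# Toy data invented for illustration only.
source = [
    ("doc1", "The wildfire burned 5,000 acres near the town on Monday."),
    ("doc2", "Officials ordered evacuations for two neighborhoods."),
    ("doc2", "Power was restored to most homes by Tuesday evening."),
]
summary = [
    "Evacuations were ordered for two neighborhoods.",
    "The stock market fell sharply.",
]

links = attribute(summary, source)
for sent, match in links:
    print(sent, "->", match)
```

In this toy run the first summary sentence links to the evacuation sentence in `doc2`, while the second has no sufficiently similar source sentence and is left unattributed, which is the kind of case an analyst would need flagged.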

Paper Structure

This paper contains 22 sections, 8 figures, 10 tables.

Figures (8)

  • Figure S1: Using embeddings in Task 1 gives stronger automatic attribution as judged by analysts compared to the NLI model.
  • Figure S2: Using embeddings in Task 2 gives stronger automatic attribution as judged by analysts compared to the NLI model.
  • Figure S3: Using Hybrid (occams/GPT) summaries on the TAC 2011 dataset gives rise to stronger automatic attribution as judged by analysts compared to using GPT alone.
  • Figure S4: Using Hybrid (occams/GPT) summaries on the Cyber dataset gives rise to stronger automatic attribution as judged by analysts compared to using GPT alone.
  • Figure S5: Using Hybrid (occams/GPT) summaries on the Cyber dataset gives rise to stronger automatic attribution as judged by cyber experts compared to using GPT alone.
  • ...and 3 more figures