Can summarization approximate simplification? A gold standard comparison
Giacomo Magnifico, Eduard Barbu
TL;DR
This paper investigates whether abstractive summarization can approximate manual simplification. It applies two BRIO-based summarization strategies to the Newsela corpus and evaluates the outputs against four levels of professionally produced simplifications using ROUGE-L. The key finding is that paragraph-by-paragraph summarization yields the strongest similarity (ROUGE-L up to 0.654 at simplification level 1) but does not substitute for manual simplification; it may nonetheless serve as a viable preprocessing baseline for simplification workflows. The work also highlights ROUGE-L's limitations for semantic similarity and proposes future directions with semantic-aware metrics like ROUGE-SEM or SARI and optimization for broader accessibility.
Abstract
This study explores the overlap between text summarization and simplification outputs. While summarization has streamlined evaluation methods, simplification evaluation lacks cohesion, prompting the question: how closely can abstractive summarization resemble gold-standard simplification? We address this by applying two BART-based BRIO summarization methods to the Newsela corpus and comparing the outputs with manually annotated simplifications, achieving a top ROUGE-L score of 0.654. The comparison provides insight into where summarization and simplification outputs converge and where they differ.
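Since the comparison rests on ROUGE-L, a minimal sketch of how such a score can be computed may help. ROUGE-L is based on the longest common subsequence (LCS) between a candidate text and a reference. The version below assumes lowercased whitespace tokenization and an equally weighted F-measure; the function names `lcs_length` and `rouge_l_f1` are illustrative, and the paper's actual ROUGE implementation and settings are not stated here, so scores from this sketch are not expected to reproduce the reported values.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the subsequence ending at the previous positions.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry over the best length found so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between two strings (assumed: whitespace tokens, beta = 1)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    summary = "The city council approved the new budget on Monday"
    simplified = "The city council said yes to the new budget"
    print(round(rouge_l_f1(summary, simplified), 3))
```

Because the score only rewards tokens that appear in the same order in both texts, a simplification that replaces words with easier synonyms is penalized even when the meaning is preserved, which is the limitation of ROUGE-L for semantic similarity noted in the TL;DR.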
