A Decoding Algorithm for Length-Control Summarization Based on Directed Acyclic Transformers

Chenyang Huang, Hao Zhou, Cameron Jen, Kangjie Zheng, Osmar R. Zaïane, Lili Mou

TL;DR

The paper tackles exact length-controlled summarization by adapting the Directed Acyclic Transformer (DAT) to generate multiple plausible fragments across an expanded canvas of $S$ steps and connect them via a path. It introduces SeqMAP, a decoding objective that marginalizes over all possible linking paths to find the most probable summary under a length budget, implemented via a beam-search dynamic programming algorithm with Expand and Merge steps, plus an optional RoBERTa-based reranker. Empirical results on Gigaword and DUC2004 show SeqMAP and PathMAP surpass CTC-based baselines, with SeqMAP achieving superior ROUGE scores and the reranker providing additional gains; LLM-based evaluations corroborate these findings. The work demonstrates that aligning decoding with training via SeqMAP and leveraging a reranker yields strong, transferable improvements for length-controlled, non-autoregressive summarization, offering practical benefits for domains with strict output length constraints.

Abstract

Length-control summarization aims to condense a long text into a short one within a certain length limit. Previous approaches often use autoregressive (AR) models and treat the length requirement as a soft constraint, which may not always be satisfied. In this study, we propose a novel length-control decoding algorithm based on the Directed Acyclic Transformer (DAT). Our approach allows for multiple plausible sequence fragments and predicts a path to connect them. In addition, we propose a Sequence Maximum a Posteriori (SeqMAP) decoding algorithm that marginalizes over the different possible paths and finds the most probable summary satisfying the length budget. Our algorithm is based on beam search, which further facilitates a reranker for performance improvement. Experimental results on the Gigaword and DUC2004 datasets demonstrate our state-of-the-art performance for length-control summarization.
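To make the difference between scoring a single path and marginalizing over paths concrete, here is a minimal brute-force sketch on a toy graph. It is not the paper's beam-search algorithm: it simply enumerates every path with exactly `length` vertices, which is only feasible for tiny graphs. The graph, its probabilities, the names `TRANS`, `EMIT`, `enumerate_joint`, `path_map`, and `seq_map`, and the convention that every path starts at the first vertex and ends at the last one are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations, product

# Hypothetical toy graph with S = 5 decoder steps (vertices 0..4).
# TRANS[i][j]: probability of linking vertex i to a later vertex j.
TRANS = {
    0: {1: 0.4, 2: 0.3, 3: 0.3},
    1: {4: 1.0},
    2: {3: 0.5, 4: 0.5},
    3: {4: 1.0},
}
# EMIT[i][w]: probability that vertex i emits token w.
EMIT = [{"a": 1.0}, {"b": 1.0}, {"c": 1.0}, {"c": 1.0}, {"d": 1.0}]


def enumerate_joint(trans, emit, length):
    """Yield (path, tokens, prob) for all paths with exactly `length` vertices.

    Assumes length >= 2 and that a path runs from vertex 0 to the last vertex.
    """
    last = len(emit) - 1
    for mid in combinations(range(1, last), length - 2):
        path = (0, *mid, last)
        p_path = 1.0
        for i, j in zip(path, path[1:]):
            p_path *= trans.get(i, {}).get(j, 0.0)
        if p_path == 0.0:
            continue  # this link sequence is not allowed by the graph
        for choice in product(*(emit[v].items() for v in path)):
            tokens = tuple(w for w, _ in choice)
            p = p_path
            for _, q in choice:
                p *= q
            yield path, tokens, p


def path_map(trans, emit, length):
    """Tokens of the single most probable (path, tokens) pair."""
    _, tokens, p = max(enumerate_joint(trans, emit, length), key=lambda t: t[2])
    return tokens, p


def seq_map(trans, emit, length):
    """Most probable token sequence after summing over all of its paths."""
    marginal = defaultdict(float)
    for _, tokens, p in enumerate_joint(trans, emit, length):
        marginal[tokens] += p
    best = max(marginal, key=marginal.get)
    return best, marginal[best]


if __name__ == "__main__":
    print("PathMAP:", path_map(TRANS, EMIT, 3))
    print("SeqMAP: ", seq_map(TRANS, EMIT, 3))
```

With a length budget of 3, the best single path in this toy graph yields "a b d" with probability 0.4, while "a c d" is reachable through two weaker paths (0.15 and 0.3) whose combined mass of about 0.45 is larger. Path-level decoding therefore returns "a b d" and sequence-level decoding returns "a c d", which is the kind of mismatch that marginalizing over paths is meant to address.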

Paper Structure

This paper contains 12 sections, 11 equations, 1 figure, 9 tables.

Figures (1)

  • Figure 1: The neural architecture of our reranker. This example assumes a beam size of $K = 3$.