Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation

Ivanhoé Botcazou, Tassadit Amghar, Sylvain Lamprier, Frédéric Saubion

TL;DR

This work addresses robust length control in neural text generation. It first shows that existing Reverse Positional Embeddings (RPE) break down when target lengths lie outside the training distribution. It then introduces Progress Ratio Embeddings (PRE), continuous embeddings that encode generation progress as a trigonometric impatience signal, enabling stable length control that generalizes to unseen lengths. Integrated into a Transformer-based encoder–decoder (BART-L), PRE achieves better length fidelity than RPE while preserving generation quality across CNN/DailyMail, XSum, and SQuAD tasks. The results suggest PRE is a practical, generalizable mechanism for fine-grained length control, with potential extensions to other architectures and tasks.

Abstract

Modern neural language models achieve high accuracy in text generation, yet precise control over generation length remains underdeveloped. In this paper, we first investigate a recent length control method based on Reverse Positional Embeddings (RPE) and show its limits when control is requested beyond the training distribution. In particular, using a discrete countdown signal tied to the absolute remaining token count leads to instability. To provide robust length control, we introduce Progress Ratio Embeddings (PRE), continuous embeddings tied to a trigonometric impatience signal. PRE integrates seamlessly into standard Transformer architectures, providing stable length fidelity without degrading text accuracy under standard evaluation metrics. We further show that PRE generalizes well to unseen target lengths. Experiments on two widely used news-summarization benchmarks validate these findings.
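
The contrast the abstract draws between an absolute countdown and a progress ratio can be illustrated with a small sketch. This is not the paper's formulation, which is not given in this excerpt: the function name `progress_ratio_embedding`, the `dim` parameter, and the choice of frequencies are all hypothetical. It only shows why a signal built from the ratio step / target length depends on relative progress rather than on the absolute token count.

```python
import math

def progress_ratio_embedding(step, target_len, dim=8):
    """Illustrative only: map the progress ratio to sin/cos features.

    The frequency schedule below is an assumption made for this sketch,
    not the formula used in the paper.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    # Progress ratio in [0, 1]: fraction of the requested length already generated.
    ratio = min(max(step / target_len, 0.0), 1.0)
    features = []
    for k in range(dim // 2):
        freq = (k + 1) * math.pi  # assumed frequencies: pi, 2*pi, 3*pi, ...
        features.append(math.sin(freq * ratio))
        features.append(math.cos(freq * ratio))
    return features

# Halfway through a 10-token target and halfway through a 100-token target
# receive the same vector, because only the ratio enters the computation.
short = progress_ratio_embedding(5, 10, dim=4)
long_ = progress_ratio_embedding(50, 100, dim=4)
```

A countdown signal would instead index an embedding by the number of remaining tokens (5 versus 50 in the example above), so a target length larger than any seen in training requires indices the model has never encountered. A ratio stays within the same bounded range for every target length.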

Paper Structure

This paper contains 25 sections, 8 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: Illustration of Progress Ratio Embeddings added to token and positional embeddings to form the input of the decoder.
  • Figure 2: MAE by target-length bucket (10 tokens) for RPE-BART-L on CNN/DailyMail.
  • Figure 3: MAE by target-length bucket (25 tokens) for RPE-BART-L on CNN/DailyMail when target lengths above 300 tokens are requested.
  • Figure 4: MAE by target-length bucket (10 tokens) for PRE-BART-L on CNN/DailyMail.
  • Figure 5: MAE by target-length bucket (25 tokens) for PRE-BART-L on CNN/DailyMail when target lengths above 300 tokens are requested.
  • ...and 9 more figures
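
The figure captions above report mean absolute error (MAE) between requested and generated lengths, grouped into target-length buckets of 10 or 25 tokens. A minimal sketch of that kind of computation follows. The function name `mae_by_bucket` and the bucketing convention (left-closed intervals starting at multiples of the bucket width) are assumptions; the paper's exact procedure is not described in this excerpt.

```python
from collections import defaultdict

def mae_by_bucket(target_lengths, generated_lengths, bucket_width=10):
    """Mean absolute length error, grouped by target-length bucket.

    Returns a dict mapping each bucket's lower bound to its MAE.
    Assumed convention: a target length t falls in the bucket
    [b, b + bucket_width) with b = (t // bucket_width) * bucket_width.
    """
    if len(target_lengths) != len(generated_lengths):
        raise ValueError("both sequences must have the same length")
    if bucket_width <= 0:
        raise ValueError("bucket_width must be positive")
    errors = defaultdict(list)
    for target, generated in zip(target_lengths, generated_lengths):
        bucket_start = (target // bucket_width) * bucket_width
        errors[bucket_start].append(abs(generated - target))
    return {start: sum(errs) / len(errs) for start, errs in sorted(errors.items())}

# Toy lengths, made up for illustration (not results from the paper).
example = mae_by_bucket([12, 18, 25], [10, 20, 25], bucket_width=10)
```

In this toy example the two targets in the 10-19 bucket are each missed by 2 tokens and the target in the 20-29 bucket is matched exactly, so the per-bucket values are 2.0 and 0.0.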