With Good MT There is No Need For End-to-End: A Case for Translate-then-Summarize Cross-lingual Summarization

Daniel Varab, Christian Hardmeier

TL;DR

This paper challenges the claim that end-to-end cross-lingual summarization (CLS) consistently outperforms pipeline approaches. It evaluates both paradigms across 39 source languages into English using the CrossSum dataset, showing that a translate-then-summarize pipeline with strong MT (M2M100) and BRIO-based summarization often outperforms end-to-end models, and that zero-shot end-to-end is not feasible on this dataset. A key finding is that translation quality, as measured by BLEU, correlates with CLS performance, enabling feasibility estimates for language pairs from public BLEU data. The work argues for rebalancing CLS research toward improving translation and monolingual summarization components rather than pursuing end-to-end designs in general, given current data constraints.

Abstract

Recent work has suggested that end-to-end system designs for cross-lingual summarization are competitive solutions that perform on par with, or even better than, traditional pipelined designs. A closer look at the evidence reveals that this intuition is based on results for only a handful of languages or on comparisons against underpowered pipeline baselines. In this work, we compare these two paradigms for cross-lingual summarization on 39 source languages into English and show that a simple translate-then-summarize pipeline design consistently outperforms even an end-to-end system with access to enormous amounts of parallel data. For languages where our pipeline model does not perform well, we show that system performance is highly correlated with publicly distributed BLEU scores, allowing practitioners to establish the feasibility of a language pair a priori. Contrary to recent publication trends, our results suggest that combining individual progress in monolingual summarization and translation offers better performance than an end-to-end system, and that end-to-end designs should be considered with care.

Paper Structure (18 sections, 2 figures, 1 table)

Figures (2)

  • Figure 1: Pipeline versus end-to-end cross-lingual summarization designs. Pipeline-based systems perform cross-lingual summarization in two steps, first translating and then summarizing (or vice versa). End-to-end systems conflate translation and summarization by training a single sequence-to-sequence model to perform both tasks simultaneously.
  • Figure 2: Collected BLEU scores on the x-axis and ROUGE-1 scores on the y-axis for TTS systems, including two outliers (Somali and Tamil) with suspiciously high BLEU scores. Removing the outliers further strengthens the relationship between the two metrics for TTS.
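
The relationship described for Figure 2 can be illustrated with a minimal sketch that computes a correlation between BLEU and ROUGE-1 with and without outliers. It assumes a Pearson correlation, which the text above does not specify. The language labels and scores are hypothetical placeholders, not values from the paper, and the helper names `pearson` and `correlation` are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical (BLEU, ROUGE-1) pairs per source language.
# These are made-up placeholder numbers, NOT results from the paper.
scores = {
    "lang_a": (10.0, 20.0),
    "lang_b": (20.0, 25.0),
    "lang_c": (30.0, 30.0),
    "lang_d": (40.0, 35.0),
    "outlier_x": (60.0, 15.0),  # suspiciously high BLEU, low ROUGE-1
}


def correlation(scores, exclude=()):
    """Correlate BLEU with ROUGE-1, skipping any languages in `exclude`."""
    kept = [pair for lang, pair in scores.items() if lang not in exclude]
    bleu = [b for b, _ in kept]
    rouge = [r for _, r in kept]
    return pearson(bleu, rouge)


r_all = correlation(scores)
r_clean = correlation(scores, exclude={"outlier_x"})
print(f"all languages:    r = {r_all:.3f}")
print(f"outliers removed: r = {r_clean:.3f}")
```

With real data, a practitioner would substitute publicly reported BLEU scores for the language pair and the measured ROUGE-1 scores of the pipeline, then check whether excluding suspect points changes the strength of the relationship.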