SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization

Jacob Parnell; Inigo Jauregi Unanue; Massimo Piccardi

TL;DR

This paper revisits the summarize-and-translate pipeline, in which summarization and translation are performed in sequence. The pipeline allows reusing the many publicly available resources for monolingual summarization and translation, and obtains highly competitive zero-shot performance.

Abstract

Cross-lingual summarization (XLS) generates summaries in a language different from that of the input documents (e.g., English to Spanish), allowing speakers of the target language to gain a concise view of their content. Today, the predominant approach to this task is to take a well-performing, pretrained multilingual language model (LM) and fine-tune it for XLS on the language pairs of interest. However, the scarcity of fine-tuning samples makes this approach challenging in some cases. For this reason, in this paper we propose revisiting the summarize-and-translate pipeline, where the summarization and translation tasks are performed in sequence. This approach allows reusing the many publicly available resources for monolingual summarization and translation, and obtains highly competitive zero-shot performance. In addition, the proposed pipeline is completely differentiable end-to-end, allowing it to take advantage of few-shot fine-tuning, where available. Experiments over two contemporary and widely adopted XLS datasets (CrossSum and WikiLingua) have shown the remarkable zero-shot performance of the proposed approach, and also its strong few-shot performance compared to an equivalent multilingual LM baseline, which the proposed approach outperforms in many languages with only 10% of the fine-tuning samples.
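
The abstract states that the pipeline is differentiable end-to-end but does not say how the summarizer's output is handed to the translator. One common way to keep such a hand-off differentiable, assumed here purely for illustration, is to pass the translator a probability-weighted mix of its input embeddings rather than a discrete argmax token. The sketch below uses a toy vocabulary; every name and value in it is hypothetical and it is not taken from the paper.

```python
# Hypothetical toy example: names, sizes and values are illustrative only,
# and the soft hand-off itself is an assumption, not the paper's stated method.

def soft_embeddings(token_probs, embedding_table):
    """Return the expected (probability-weighted) embedding for each step.

    token_probs: list of per-step probability distributions over the vocabulary.
    embedding_table: list of embedding vectors, one per vocabulary entry.
    """
    dim = len(embedding_table[0])
    mixed = []
    for probs in token_probs:
        vec = [0.0] * dim
        for p, emb in zip(probs, embedding_table):
            for d in range(dim):
                vec[d] += p * emb[d]
        mixed.append(vec)
    return mixed


def hard_embeddings(token_probs, embedding_table):
    """Argmax lookup: a discrete choice, so no gradient reaches token_probs."""
    return [
        embedding_table[max(range(len(probs)), key=probs.__getitem__)]
        for probs in token_probs
    ]


# Toy 3-word vocabulary with 2-dimensional translator input embeddings.
EMBEDDINGS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Two summarizer decoding steps, each a distribution over the vocabulary.
SUMMARY_PROBS = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]

soft = soft_embeddings(SUMMARY_PROBS, EMBEDDINGS)
hard = hard_embeddings(SUMMARY_PROBS, EMBEDDINGS)
```

In the soft variant every output coordinate is a linear function of the summarizer's probabilities, so a loss computed on the translator's output could in principle be back-propagated into the summarizer. The hard variant discards that dependence at the argmax step.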

Paper Structure (22 sections, 6 equations, 5 figures, 11 tables)

Introduction
Related Work
SumTra
Experimental Setup
Datasets, Baselines, Evaluation Metrics
Model Training
Results and Analysis
Alternative Monolingual Training
Cross-Domain Analysis
The Catastrophic Forgetting Problem
Qualitative Analysis
Inference Time
Conclusion
Appendix
Experimental Setup
...and 7 more sections

Figures (5)

Figure 1: Performance comparison between SumTra models trained with CNN/DM and XSum, and with the CrossSum English training split.
Figure 2: Cross-domain mROUGE/BERTScore scores for Spanish and Arabic. Left: CrossSum-tuned and WikiLingua-tested; Right: vice versa. We have also included mBART-50 (1000-shot) to highlight SumTra's few-shot capability.
Figure 3: Exploring the catastrophic forgetting problem with mBART-50, mBART-50-mono and SumTra on the CrossSum Spanish and Bengali test sets.
Figure 4: BERTScore scores for the CrossSum Spanish and Bengali test sets with different fine-tuning configurations (summarizer only, translator only, and both).
Figure 5: BERTScore and MoverScore comparison over the Spanish and Bengali test sets (CrossSum).
