ConVerSum: A Contrastive Learning-based Approach for Data-Scarce Solution of Cross-Lingual Summarization Beyond Direct Equivalents

Sanzana Karim Lora; M. Sohel Rahman; Rifat Shahriyar

TL;DR

ConVerSum is a novel data-efficient approach to cross-lingual summarization (CLS) that leverages contrastive learning: it generates versatile candidate summaries in different languages from the given source document and contrasts these summaries with the reference summaries for the given documents.

Abstract

Cross-lingual summarization (CLS) is a sophisticated branch of Natural Language Processing that requires models to accurately translate and summarize articles written in different source languages. Despite improvements in subsequent studies, this area still needs data-efficient solutions along with effective training methodologies. To the best of our knowledge, there is no feasible solution for CLS when no high-quality CLS data is available. In this paper, we propose a novel data-efficient approach, ConVerSum, for CLS that leverages the power of contrastive learning: it generates versatile candidate summaries in different languages based on the given source document and contrasts these summaries with the reference summaries for the given documents. We then train the model with a contrastive ranking loss. Finally, we rigorously evaluate the proposed approach against current methodologies and compare it to powerful Large Language Models (LLMs), namely Gemini, GPT 3.5, and GPT 4o, showing that our model performs better on CLS for low-resource languages. These findings represent a substantial improvement in the area, opening the door to more efficient and accurate cross-lingual summarization techniques.
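To make the idea of a contrastive ranking loss concrete, the following is a minimal sketch of a pairwise margin ranking loss over candidate summaries. It is an illustration under stated assumptions, not the paper's exact formulation: the abstract does not give the loss, so the rank-scaled margin, the function names `rank_candidates` and `contrastive_ranking_loss`, and the default `margin` value are all hypothetical. The sketch assumes each candidate already has a quality score (its similarity to the reference summary) and a model score (for example, a length-normalized log-probability).

```python
from typing import List, Sequence


def rank_candidates(quality_scores: Sequence[float]) -> List[int]:
    """Return candidate indices ordered best-first by quality score.

    Assumption: a higher quality score means the candidate is closer
    to the reference summary.
    """
    return sorted(range(len(quality_scores)),
                  key=lambda k: quality_scores[k],
                  reverse=True)


def contrastive_ranking_loss(model_scores: Sequence[float],
                             margin: float = 0.01) -> float:
    """Pairwise margin ranking loss (hypothetical form, not from the paper).

    model_scores must already be ordered best-first by candidate quality.
    Each better candidate i is asked to outscore each worse candidate j
    by at least margin * (j - i); violations are summed.
    """
    loss = 0.0
    n = len(model_scores)
    for i in range(n):
        for j in range(i + 1, n):
            required_gap = margin * (j - i)  # larger rank gap, larger margin
            loss += max(0.0, model_scores[j] - model_scores[i] + required_gap)
    return loss


def loss_for_candidates(quality_scores: Sequence[float],
                        model_scores: Sequence[float],
                        margin: float = 0.01) -> float:
    """Sort model scores by candidate quality, then apply the ranking loss."""
    order = rank_candidates(quality_scores)
    return contrastive_ranking_loss([model_scores[k] for k in order], margin)
```

In this sketch the loss is zero when the model already orders the candidates the same way the quality scores do with sufficient separation, and it grows as the model prefers lower-quality candidates. In practice the model scores would come from a trainable sequence-to-sequence model and the loss would be computed on tensors so that gradients can flow.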

Paper Structure (51 sections, 3 equations, 2 figures, 19 tables)

Introduction
Related Works
Pipeline Methods
End-to-End Methods
Contrastive Learning for Monolingual Abstractive Summarization
ConVerSum
First Phase: Candidate Summary Generation in Different Languages
Seq2Seq Model Selection
Diverse Summary Exploration
Second Phase: Candidate Summary Quality Measurement
Evaluation Function Initialization
Similarity Calculation
Candidate Summary Scoring
Third Phase: Contrastive Learning Approach
Positive-Negative Pair Construction
...and 36 more sections

Figures (2)

Figure 1: Example of CLS
Figure 2: General Structure of ConVerSum. Here CS, RS, Lang, and Doc refer to candidate summary, reference summary, language, and document, respectively. For illustration, we assume score1 > score2 > score3 > score4 > … > scoreN.
