Recent Trends in Unsupervised Summarization

Mohammad Khosravani, Amine Trabelsi

TL;DR

Unsupervised summarization addresses the problem of producing concise, informative summaries without labeled data. The paper presents a fine-grained taxonomy separating abstractive, extractive, and hybrid methods, and surveys advances across language-model-based generation, reconstruction-based training, and data-driven fine-tuning, including weakly/self-supervised and few-shot approaches. It surveys datasets and evaluation methods, analyzes trends and limitations, and discusses practical concerns such as the cost and reliability of large language models, as well as the challenges of long-/multi-document summarization. The work serves as a comprehensive reference for researchers to understand current techniques, datasets, and evaluation practices, and to identify promising directions for scalable and domain-adaptive unsupervised summarization.

Abstract

Unsupervised summarization is a powerful technique that enables training summarization models without requiring labeled datasets. This survey covers recent techniques and models used for unsupervised summarization, including extractive, abstractive, and hybrid models and the strategies used to train them without supervision. While the main focus of this survey is on recent research, we also cover some important earlier work. We additionally introduce a taxonomy that classifies research by its approach to unsupervised training. Finally, we discuss the current approaches and mention some datasets and evaluation methods.

Paper Structure

This paper contains 22 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Hierarchical Structure of Our Taxonomy