Unsupervised Extractive Dialogue Summarization in Hyperdimensional Space

Seongmin Park; Kyungho Kim; Jaejin Seo; Jihwa Lee

TL;DR

HyperSum tackles the need for fast, faithful unsupervised extractive dialogue summarization. It builds sentence embeddings in a high-dimensional space ($D=10{,}000$) using thermometer encoding and position-aware binding, then selects central sentences via $k$-medoids. The results show HyperSum often surpasses state-of-the-art baselines in ROUGE and ExtEval while being orders of magnitude faster on CPU. The work provides a strong new baseline and open-source release for unsupervised extractive dialogue summarization.

Abstract

We present HyperSum, an extractive summarization framework that captures both the efficiency of traditional lexical summarization and the accuracy of contemporary neural approaches. HyperSum exploits the pseudo-orthogonality that emerges when randomly initializing vectors at extremely high dimensions ("blessing of dimensionality") to construct representative and efficient sentence embeddings. Simply clustering the obtained embeddings and extracting their medoids yields competitive summaries. HyperSum often outperforms state-of-the-art summarizers -- in terms of both summary accuracy and faithfulness -- while being 10 to 100 times faster. We open-source HyperSum as a strong baseline for unsupervised extractive summarization.
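The two ideas the abstract relies on, pseudo-orthogonality of random high-dimensional vectors and medoid extraction, can be illustrated with a minimal sketch. This is not the HyperSum implementation: it omits thermometer encoding, position-aware binding, and the full $k$-medoids clustering, and only picks the medoid of a single toy cluster. All names (`random_hypervector`, `flip_entries`, `medoid_index`) and the choice of bipolar $\{-1, +1\}$ vectors are assumptions made for illustration.

```python
import random

D = 10_000  # dimensionality quoted in the TL;DR


def random_hypervector(rng, dim=D):
    """Random bipolar vector with entries in {-1, +1} (illustrative choice)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


def flip_entries(vec, n_flips, rng):
    """Return a noisy copy of vec with n_flips randomly chosen signs flipped."""
    noisy = list(vec)
    for i in rng.sample(range(len(vec)), n_flips):
        noisy[i] = -noisy[i]
    return noisy


def medoid_index(vectors):
    """Index of the vector with the highest total similarity to all others."""
    def total_similarity(i):
        return sum(cosine(vectors[i], v)
                   for j, v in enumerate(vectors) if j != i)
    return max(range(len(vectors)), key=total_similarity)


rng = random.Random(0)

# Pseudo-orthogonality: two independent random hypervectors are nearly
# orthogonal; their cosine similarity has standard deviation about 1/sqrt(D).
a, b = random_hypervector(rng), random_hypervector(rng)
sim_ab = cosine(a, b)

# Toy "cluster": one base vector plus three noisy copies (10% of signs flipped).
base = random_hypervector(rng)
cluster = [
    flip_entries(base, 1000, rng),
    flip_entries(base, 1000, rng),
    base,
    flip_entries(base, 1000, rng),
]
center = medoid_index(cluster)  # index of the most central vector

print(f"similarity of two random hypervectors: {sim_ab:+.4f}")
print(f"medoid index in toy cluster: {center}")
```

In this toy setup the unperturbed vector sits at similarity 0.8 to each noisy copy, while the noisy copies are only about 0.64 similar to one another, so the medoid is the unperturbed vector. In a summarizer of the kind described above, each cluster's medoid would correspond to one extracted sentence.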

Paper Structure (14 sections, 2 equations, 1 figure, 7 tables)

Introduction
Background
Hyperdimensional computing
Extractive summarization
Methodology
Constructing sentence embeddings
Summary extraction
Experiments and results
Datasets and metrics
Summarization accuracy
Summarization faithfulness
Summarization execution time
Ablations
Conclusion

Figures (1)

Figure 1: HyperSum's utterance embeddings for clip #6 from the Behance dataset, visualized with t-SNE (van der Maaten and Hinton, 2008). Different shapes denote different sentence clusters. The shaded marker in each cluster is its medoid, which is selected as that cluster's representative summary sentence.
