Action-Item-Driven Summarization of Long Meeting Transcripts

Logan Golia, Jugal Kalita

TL;DR

The paper tackles automatic, action-item–driven summarization of long meeting transcripts, addressing long-range dependencies that hinder standard dialogue-based summaries. It proposes a divide-and-conquer, recursive framework that partitions transcripts into topic-based chunks, generates per-chunk general and action-item summaries, and then recursively combines them. Three novel topic segmentation methods, a neighborhood-based action-item extraction pipeline, and parallelized processing yield improved semantic alignment (as measured by BERTScore) on the AMI corpus, surpassing prior BART-based approaches. The work demonstrates that action-item-driven, recursive summaries are more informative and coherent for meeting minutes and suggests broader applicability to other long-form text scenarios.

Abstract

The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
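
The divide-and-conquer structure described above can be illustrated with a small sketch. This is not the authors' implementation: fixed-size chunking stands in for their topic segmentation methods, a keyword filter (the hypothetical `ACTION_CUES` list) stands in for their action-item extraction algorithm, and keeping the first utterance of a chunk stands in for an abstractive summarization model. All function names are illustrative assumptions. Only the control flow (split, summarize sections in parallel, combine, recurse) follows the abstract.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cue phrases; a stand-in for a real action-item extractor.
ACTION_CUES = ("will ", "need to", "should ", "by next")


def split_into_chunks(utterances, size):
    """Fixed-size split; a stand-in for topic-based segmentation."""
    return [utterances[i:i + size] for i in range(0, len(utterances), size)]


def summarize_chunk(chunk):
    """Placeholder for an abstractive model.

    Keeps the first utterance as the 'general summary' and any later
    utterance that contains one of the action cues.
    """
    if not chunk:
        return []
    general = chunk[0]
    actions = [u for u in chunk[1:] if any(c in u.lower() for c in ACTION_CUES)]
    return [general] + actions


def recursive_summary(utterances, size=4):
    """Summarize sections in parallel, combine them, and recurse."""
    if len(utterances) <= size:
        return summarize_chunk(utterances)
    chunks = split_into_chunks(utterances, size)
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of the sections.
        partials = list(pool.map(summarize_chunk, chunks))
    combined = [u for part in partials for u in part]
    if len(combined) >= len(utterances):
        # No shrinkage in this pass; stop to guarantee termination.
        return combined
    return recursive_summary(combined, size)
```

Calling `recursive_summary(transcript)` on a list of utterance strings returns a shorter list. In a real pipeline the two placeholder steps would be replaced by model calls, and the chunk size by a segmentation strategy suited to the transcript.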
