Towards interfacing large language models with ASR systems using confidence measures and prompting

Maryam Naderi; Enno Hermann; Alexandre Nanchen; Sevada Hovsepyan; Mathew Magimai.-Doss

TL;DR

This work investigates post-hoc correction of ASR transcripts with LLMs and proposes a range of confidence-based filtering methods that can improve the performance of less competitive ASR systems.

Abstract

As large language models (LLMs) grow in parameter size and capabilities, such as interaction through prompting, they open up new ways of interfacing with automatic speech recognition (ASR) systems beyond rescoring n-best lists. This work investigates post-hoc correction of ASR transcripts with LLMs. To avoid introducing errors into likely accurate transcripts, we propose a range of confidence-based filtering methods. Our results indicate that this can improve the performance of less competitive ASR systems.
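
The filtering idea can be illustrated with a minimal sketch. It assumes the ASR system exposes one confidence score per word in [0, 1], that sentence-level confidence is the mean of those scores (an assumption; the exact definition is not given here), and that the lowest-word variant uses their minimum. The names `Hypothesis`, `maybe_correct`, and `correct_fn` are hypothetical, and `correct_fn` stands in for whatever prompt-based LLM call is used.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class Hypothesis:
    """An ASR transcript with one confidence score per word (hypothetical structure)."""
    words: Sequence[str]
    word_confidences: Sequence[float]  # assumed to lie in [0, 1]

    @property
    def text(self) -> str:
        return " ".join(self.words)


def sentence_confidence(hyp: Hypothesis) -> float:
    # Assumption: sentence-level confidence is the mean word confidence.
    return mean(hyp.word_confidences)


def lowest_word_confidence(hyp: Hypothesis) -> float:
    # Lowest-word variant: the least confident word decides.
    return min(hyp.word_confidences)


def maybe_correct(
    hyp: Hypothesis,
    correct_fn: Callable[[str], str],
    threshold: float,
    score_fn: Callable[[Hypothesis], float] = sentence_confidence,
) -> str:
    """Send the transcript to the LLM only if its confidence is below the threshold.

    Transcripts at or above the threshold are returned unchanged, so the LLM
    cannot introduce errors into transcripts that are likely already accurate.
    """
    if not hyp.words or score_fn(hyp) >= threshold:
        return hyp.text
    return correct_fn(hyp.text)
```

Passing `score_fn=lowest_word_confidence` switches from the sentence-level criterion to the lowest-word criterion without changing the rest of the pipeline; the threshold itself would need to be tuned on held-out data.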

Paper Structure (16 sections, 3 figures, 5 tables)

Introduction
Related works
Experimental setup
ASR system
Large language model
Confidence-based filtering
Dataset
Results
Prompt selection
Influence of ASR performance
Confidence-based filtering
Correction of specific words
Test set performance
Error analysis
Conclusions
...and 1 more section

Figures (3)

Figure 1: Proposed approach (left) and speech processing in the brain (right).
Figure 2: WER for various sentence-level (left) and lowest-word (right) confidence thresholds for Tiny, Medium, and Large V3 Whisper models applied on dev-clean dataset with gpt-3.5-turbo-1106.
Figure 3: WER for various thresholds for specific low-confidence words with Tiny Whisper model applied on dev-clean dataset with gpt-3.5-turbo-1106.
