An experiment on an automated literature survey of data-driven speech enhancement methods

Arthur dos Santos, Jayr Pereira, Rodrigo Nogueira, Bruno Masiero, Shiva Sander-Tavallaey, Elias Zea

TL;DR

The paper tackles the challenge of keeping pace with rapidly growing acoustics literature by testing an automated GPT-based approach to survey data-driven speech enhancement papers. It processes 116 papers from 2021 using a GPT-3.5-turbo-16k model, converting PDFs to text and answering four predefined queries, then comparing results to a human-based ground truth. Findings show the approach can accurately answer simple questions (e.g., author-country), but more nuanced questions about channel configurations, architectures, and applications require better prompts, metadata context, or model fine-tuning. This work demonstrates the feasibility of AI-assisted literature reviews in acoustics and points to avenues for scaling surveys to thousands of papers with improved reliability.

Abstract

The increasing number of scientific publications in acoustics, in general, presents difficulties in conducting traditional literature surveys. This work explores the use of a generative pre-trained transformer (GPT) model to automate a literature survey of 116 articles on data-driven speech enhancement methods. The main objective is to evaluate the capabilities and limitations of the model in providing accurate responses to specific queries about the papers selected from a reference human-based survey. While we see great potential to automate literature surveys in acoustics, improvements are needed to address technical questions more clearly and accurately.
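
The comparison step described above, checking machine answers against a human-based ground truth, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' evaluation code: the function name `per_question_accuracy`, the paper IDs, the question keys, and the case-insensitive exact-match rule are all hypothetical.

```python
from collections import defaultdict


def per_question_accuracy(machine, human):
    """Return the fraction of matching answers per question.

    Both arguments map (paper_id, question) -> answer string.
    Matching is case-insensitive exact match, which is an assumption
    made for this sketch; the paper may compare answers differently.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for (paper_id, question), truth in human.items():
        total[question] += 1
        predicted = machine.get((paper_id, question))
        # A missing machine answer counts as incorrect.
        if predicted is not None and predicted.strip().lower() == truth.strip().lower():
            correct[question] += 1
    return {question: correct[question] / total[question] for question in total}


# Hypothetical toy data: two papers, two of the survey questions.
human = {
    ("p1", "country"): "Brazil",
    ("p2", "country"): "Sweden",
    ("p1", "channels"): "single",
    ("p2", "channels"): "multi",
}
machine = {
    ("p1", "country"): "brazil ",
    ("p2", "country"): "Sweden",
    ("p1", "channels"): "multi",
    ("p2", "channels"): "multi",
}

accuracy = per_question_accuracy(machine, human)
```

Reporting accuracy per question, rather than one pooled number, mirrors the observation that simple questions such as author country are answered more reliably than technical ones.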

Paper Structure (12 sections, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Four decades of articles on applications of data-driven methods in acoustics. The results have been obtained from a Scopus search on August 2, 2023.
  • Figure 2: Simplified pie charts for the human-based survey [dosSantos2022].
  • Figure 3: Stacked bar charts for the machine-based answers to the four questions using the corpus of 116 papers.