An experiment on an automated literature survey of data-driven speech enhancement methods
Arthur dos Santos, Jayr Pereira, Rodrigo Nogueira, Bruno Masiero, Shiva Sander-Tavallaey, Elias Zea
TL;DR
The paper tackles the challenge of keeping pace with rapidly growing acoustics literature by testing an automated GPT-based approach to survey data-driven speech enhancement papers. It processes 116 papers from 2021 using a GPT-3.5-turbo-16k model, converting PDFs to text and answering four predefined queries, then comparing results to a human-based ground truth. Findings show the approach can accurately answer simple questions (e.g., author-country), but more nuanced questions about channel configurations, architectures, and applications require better prompts, metadata context, or model fine-tuning. This work demonstrates the feasibility of AI-assisted literature reviews in acoustics and points to avenues for scaling surveys to thousands of papers with improved reliability.
Abstract
The growing number of scientific publications in acoustics makes traditional literature surveys increasingly difficult to conduct. This work explores the use of a generative pre-trained transformer (GPT) model to automate a literature survey of 116 articles on data-driven speech enhancement methods. The main objective is to evaluate the capabilities and limitations of the model in providing accurate responses to specific queries about the papers, which were selected from a reference human-based survey. While we see great potential for automating literature surveys in acoustics, improvements are needed before the model can answer technical questions clearly and accurately.
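The evaluation described here, comparing the model's answers with a human-based reference for each query, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the query labels, the paper identifiers, the example answers, and the exact-match scoring rule after normalization. The text above does not specify how answers were scored, so this is not the authors' procedure.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(answer.lower().split())


def per_query_accuracy(model_answers, reference_answers):
    """Return the fraction of papers answered correctly for each query.

    Both arguments map (paper_id, query) -> answer string.
    Scoring is exact match after normalization (an assumed rule).
    A missing model answer counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for key, reference in reference_answers.items():
        _, query = key
        total[query] += 1
        if normalize(model_answers.get(key, "")) == normalize(reference):
            correct[query] += 1
    return {query: correct[query] / total[query] for query in total}


# Hypothetical data: two papers, two queries (not taken from the survey).
reference = {
    ("paper_1", "country"): "Sweden",
    ("paper_2", "country"): "Brazil",
    ("paper_1", "channels"): "single-channel",
    ("paper_2", "channels"): "multi-channel",
}
model = {
    ("paper_1", "country"): "sweden ",
    ("paper_2", "country"): "Brazil",
    ("paper_1", "channels"): "single-channel",
    ("paper_2", "channels"): "single-channel",
}

accuracy = per_query_accuracy(model, reference)
print(accuracy)
```

Reporting accuracy per query rather than as a single overall number makes it possible to see which question types are handled reliably and which are not. Exact matching is likely too strict for free-text answers about architectures or applications, where a human judgment or a more tolerant comparison would be needed.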
