Evaluating Speech Enhancement Systems Through Listening Effort

Femke B. Gelderblom; Tron V. Tronstad; Iván López-Espejo

TL;DR

The paper addresses the challenge of evaluating speech enhancement (SE) systems by measuring listening effort (LE) alongside intelligibility. It proposes a simple, single-task method based on reaction times in a Hagerman matrix test: incorrect responses are filtered out and the timing is not signaled to subjects, so LE can be assessed without extra experimental burden. Across two independent studies (Norway and Denmark) with 76 participants and 9 processing conditions, the method proved sensitive to LE, which changed with SNR and with processing condition even when intelligibility was not severely affected. The findings suggest that LE measurements can be integrated into SE development pipelines as a lightweight, standardized complement to traditional intelligibility metrics.
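The reaction-time measure summarized above can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the trial fields (`condition`, `snr_db`, `correct`, `rt_s`), the function name `summarize_trials`, and the sample values are all hypothetical, and it assumes each trial is scored as simply correct or incorrect. It shows how one set of matrix-test trials could yield both an intelligibility score and a mean reaction time over correct responses only.

```python
from collections import defaultdict
from statistics import mean


def summarize_trials(trials):
    """Group trials by (condition, SNR) and return, for each group,
    a tuple (intelligibility, mean reaction time of correct trials).

    Field names are hypothetical; the paper's data format is not given here.
    """
    grouped = defaultdict(list)
    for trial in trials:
        grouped[(trial["condition"], trial["snr_db"])].append(trial)

    summary = {}
    for key, group in grouped.items():
        # Keep reaction times from correct responses only (assumed filter).
        correct_rts = [t["rt_s"] for t in group if t["correct"]]
        intelligibility = len(correct_rts) / len(group)
        mean_rt = mean(correct_rts) if correct_rts else None
        summary[key] = (intelligibility, mean_rt)
    return summary


# Made-up example trials, not data from the studies.
trials = [
    {"condition": "unprocessed", "snr_db": -5, "correct": True, "rt_s": 2.0},
    {"condition": "unprocessed", "snr_db": -5, "correct": False, "rt_s": 4.0},
    {"condition": "unprocessed", "snr_db": 5, "correct": True, "rt_s": 1.0},
    {"condition": "unprocessed", "snr_db": 5, "correct": True, "rt_s": 1.5},
]
summary = summarize_trials(trials)
```

Because both quantities come from the same trials, no second task is needed; comparing the mean reaction times across SNRs or processing conditions is then a separate statistical step.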

Abstract

Understanding degraded speech is demanding, requiring increased listening effort (LE). Evaluating processed and unprocessed speech with respect to LE can objectively indicate if speech enhancement systems benefit listeners. However, existing methods for measuring LE are complex and not widely applicable. In this study, we propose a simple method to evaluate speech intelligibility and LE simultaneously without additional strain on subjects or operators. We assess this method using results from two independent studies in Norway and Denmark, testing 76 (50+26) subjects across 9 (6+3) processing conditions. Despite differences in evaluation setups, subject recruitment, and processing systems, trends are strikingly similar, demonstrating the proposed method's robustness and ease of implementation into existing practices.

Paper Structure

This paper contains 10 sections, 1 figure, and 4 tables.

Introduction
Experiments
Norwegian Intelligibility Test Description
Danish Intelligibility Test Description
Statistical Analysis
Results
Norwegian Results
Danish Results
Discussion
Conclusion

Figures (1)

Figure 1: Regression lines with 95% confidence intervals (shaded areas) for the three groups in the Norwegian dataset.
