Performance and energy balance: a comprehensive study of state-of-the-art sound event detection systems

Francesca Ronchini; Romain Serizel

TL;DR

The paper addresses the environmental impact and energy footprint of deep learning-based sound event detection (SED) by analyzing DCASE Task 4 submissions from 2022 and 2023. It standardizes energy reporting, counts multiply-accumulate operations (MACs) with THOP, and introduces a hardware-aware energy-weighted polyphonic sound detection score (EW-PSDS), defined as $\text{EW-PSDS} = \text{PSDS} \cdot \frac{\text{kWh}_{\text{baseline}}}{\text{kWh}_{\text{submission}}}$, to balance performance against energy. The findings show that energy consumption and model complexity do not always align with performance, that ensembles can boost accuracy at a higher energy cost, and that thresholding can reduce the footprint with limited PSDS loss. The work advocates multi-metric, task-aware energy evaluations to guide the sustainable design of SED systems and reduce their environmental impact in practical deployments.
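The EW-PSDS relation stated in the TL;DR can be illustrated with a minimal Python sketch. The function name `ew_psds` and all numeric values below are hypothetical, chosen only to show how the energy ratio rescales PSDS; they are not results from the paper.

```python
def ew_psds(psds: float, kwh_submission: float, kwh_baseline: float) -> float:
    """Energy-weighted PSDS: PSDS scaled by (baseline energy / submission energy).

    Both energy values are expected in kWh and measured on comparable
    hardware (an assumption of this sketch).
    """
    if kwh_submission <= 0 or kwh_baseline <= 0:
        raise ValueError("energy values must be positive")
    return psds * (kwh_baseline / kwh_submission)


# Hypothetical numbers, for illustration only.
KWH_BASELINE = 1.5

# A system that uses exactly the baseline energy keeps its PSDS unchanged.
baseline_score = ew_psds(psds=0.35, kwh_submission=1.5, kwh_baseline=KWH_BASELINE)

# A more accurate system that uses 4x the baseline energy is scaled by 1/4.
heavy_score = ew_psds(psds=0.50, kwh_submission=6.0, kwh_baseline=KWH_BASELINE)

print(f"baseline-like system: {baseline_score:.3f}")
print(f"heavier system:       {heavy_score:.3f}")
```

In this made-up case the heavier system has the higher raw PSDS but the lower EW-PSDS, which is the kind of trade-off the metric is meant to expose.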

Abstract

In recent years, deep learning systems have shown a concerning trend toward increased complexity and higher energy consumption. As researchers in this domain and organizers of one of the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge tasks, we recognize the importance of addressing the environmental impact of data-driven sound event detection (SED) systems. In this paper, we propose an analysis of SED systems based on the challenge submissions. This includes a comparison across the past two years and a detailed analysis of this year's SED systems. Through this research, we aim to explore how SED systems evolve from year to year and what this implies for their energy efficiency.

Paper Structure (9 sections, 1 equation, 9 figures, 2 tables)

This paper contains 9 sections, 1 equation, 9 figures, 2 tables.

Introduction
Analysis setup and evaluation metrics
General comparison between DCASE 2022 and DCASE 2023 systems
Relation between system complexity, MACs, and energy consumption
Relation between performance and energy consumption
Comparison between ensemble/non-ensemble systems
Relation between EW-PSDS and PSDS
Thresholding based on energy consumption
Conclusions

Figures (9)

Figure 1: Relation between system complexity and energy consumption at training for 2023 entries, compared with the two baseline systems.
Figure 2: Relation between system complexity and energy consumption at test for 2023 entries, compared with the two baseline systems.
Figure 3: PSDS_1 and energy consumption at training for the best 2023 systems, compared with the two baseline systems.
Figure 4: PSDS_1 and energy consumption at test for the best-performing 2023 systems, compared with the two baseline systems.
Figure 5: Relation between MACs and energy consumption at training for 2023 entries, compared with the two baseline systems.
...and 4 more figures
