Table of Contents
Fetching ...

How Much Does Machine Identity Matter in Anomalous Sound Detection at Test Time?

Kevin Wilkinghoff, Keisuke Imoto, Zheng-Hua Tan

TL;DR

Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.

Abstract

Anomalous sound detection (ASD) benchmarks typically assume that the identity of the monitored machine is known at test time and that recordings are evaluated in a machine-wise manner. However, in realistic monitoring scenarios with multiple known machines operating concurrently, test recordings may not be reliably attributable to a specific machine, and requiring machine identity imposes deployment constraints such as dedicated sensors per machine. To reveal performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, we consider a minimal modification of the ASD evaluation protocol in which test recordings from multiple machines are merged and evaluated jointly without access to machine identity at inference time. Training data and evaluation metrics remain unchanged, and machine identity labels are used only for post hoc evaluation. Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.

How Much Does Machine Identity Matter in Anomalous Sound Detection at Test Time?

TL;DR

Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.

Abstract

Anomalous sound detection (ASD) benchmarks typically assume that the identity of the monitored machine is known at test time and that recordings are evaluated in a machine-wise manner. However, in realistic monitoring scenarios with multiple known machines operating concurrently, test recordings may not be reliably attributable to a specific machine, and requiring machine identity imposes deployment constraints such as dedicated sensors per machine. To reveal performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, we consider a minimal modification of the ASD evaluation protocol in which test recordings from multiple machines are merged and evaluated jointly without access to machine identity at inference time. Training data and evaluation metrics remain unchanged, and machine identity labels are used only for post hoc evaluation. Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.
Paper Structure (17 sections, 4 equations, 2 figures, 1 table)

This paper contains 17 sections, 4 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Comparison of the standard DCASE evaluation protocol and the proposed evaluation without machine identity. Training data and anomaly detection metrics are unchanged; machine identity is unavailable at inference time, and machine identification accuracy is reported as an auxiliary metric.
  • Figure 2: Relation between chance-normalized machine identification accuracy and anomaly detection performance degradation across asd systems. The dashed line shows a global least-squares fit over all plotted data points. Dataset-specific fits (not shown) exhibit negative slopes, with reduced magnitude when identification accuracy is already near its ceiling for most systems.