ASD-Diffusion: Anomalous Sound Detection with Diffusion Models

Fengrun Zhang; Xiang Xie; Kai Guo

TL;DR

ASD-Diffusion addresses unsupervised anomalous sound detection under first-shot constraints by using diffusion models to reconstruct noise-corrupted acoustic features toward normal patterns. The method combines a diffusion-based reconstruction pipeline, a post-processing anomaly filter (AF) for localization, and DDIM-based acceleration for faster inference. On the DCASE 2023 task 2 development set, it outperforms the baseline by 7.75% and shows strong cross-domain detection, with effective localization via the AF. The work demonstrates the viability of diffusion models for industrial ASD, offering both improved performance and practical speed, and points to future directions in unsupervised anomaly localization.

Abstract

Unsupervised Anomalous Sound Detection (ASD) aims to design a generalizable method that can detect anomalies when only normal sounds are given. In this paper, Anomalous Sound Detection based on Diffusion Models (ASD-Diffusion) is proposed for ASD in real-world factories. First, the anomalies in acoustic features are reconstructed from their noise-corrupted features into an approximately normal pattern. Second, a post-processing anomaly filter algorithm is proposed to detect anomalies that exhibit significant deviation from the original input after reconstruction. Furthermore, a denoising diffusion implicit model (DDIM) is introduced to accelerate inference by using a longer sampling interval in the denoising process. The novelty of the proposed method lies in applying diffusion models to ASD as a new scheme. On the development set of DCASE 2023 challenge task 2, the proposed method outperforms the baseline by 7.75%, demonstrating its effectiveness.
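The acceleration mentioned in the abstract (a longer sampling interval in the denoising process) can be illustrated with a minimal sketch of strided DDIM sampling. The linear noise schedule, the function names, and the scalar feature value are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed, commonly used choice)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly strided sub-sequence of timesteps, noisiest first."""
    stride = num_train_steps // num_sample_steps
    return [i * stride for i in range(num_sample_steps)][::-1]


def ddim_step(x_t, eps_hat, abar_t, abar_prev):
    """Deterministic DDIM update (eta = 0) for one scalar feature value.

    eps_hat stands in for the noise predicted by the trained network.
    """
    # Estimate the clean value implied by the predicted noise.
    x0_hat = (x_t - math.sqrt(1.0 - abar_t) * eps_hat) / math.sqrt(abar_t)
    # Jump directly to the earlier (less noisy) timestep.
    return math.sqrt(abar_prev) * x0_hat + math.sqrt(1.0 - abar_prev) * eps_hat
```

With, say, 1000 training steps and 50 sampling steps (illustrative numbers, not taken from the paper), the noise predictor is evaluated 50 times per reconstruction instead of 1000.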

Paper Structure (17 sections, 7 equations, 4 figures, 4 tables)

This paper contains 17 sections, 7 equations, 4 figures, 4 tables.

Introduction
Methods
Diffusion Models for ASD-Diffusion
DDPM
DDIM
Anomaly Detection with Diffusion Models
Experiments
Dataset
Experimental Settings
Evaluation Metrics
Results
Main Results
Comparison with Self-supervised Methods
Visualization of Anomaly Detection
Influence of AF Parameter
...and 2 more sections

Figures (4)

Figure 1: Forward and reverse process of DDPM.
Figure 2: Overview of ASD-Diffusion. In stage 1, $x_{t}$ is obtained by adding noise $\epsilon$ to $x_{0}$ through forward diffusion, and a neural network $\epsilon_\theta\left(x_t, t\right)$ is trained to estimate the noise $\hat{\epsilon}$ from $x_{t}$. In stage 2, $\epsilon_\theta\left(x_t, t\right)$ reconstructs $\hat{x}_{0}$ from $x_{\hat{t}}$, which is obtained by forward diffusion; $\mathcal{S}_{anomaly}$ is then calculated by an anomaly detection function.
Figure 3: Visualization of an anomalous audio (a) and a normal audio (b). First row: original FBank. Second row: reconstructed FBank. Third row: detection result of MAE. Last row: detection result with AF.
Figure 4: Performance of the proposed method under different K or ReLU functions.
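Stage 2 in the Figure 2 caption (noise the input to $x_{\hat{t}}$, reconstruct $\hat{x}_{0}$, then score the deviation) can be sketched as follows. This is a minimal illustration under stated assumptions: the feature map is a list of frames, `reconstruct` is a hypothetical callable standing in for the trained reverse process, and the score is plain mean absolute error as in the MAE row of Figure 3. The anomaly filter (AF) is not reproduced here because its definition is not given in this section.

```python
import math
import random


def forward_diffuse(x0, abar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar) * x0, (1 - abar) * I)."""
    scale, sigma = math.sqrt(abar), math.sqrt(1.0 - abar)
    return [[scale * v + sigma * rng.gauss(0.0, 1.0) for v in frame]
            for frame in x0]


def mae_score(x0, x0_hat):
    """Mean absolute error between the input and its reconstruction."""
    total = sum(abs(a - b)
                for fa, fb in zip(x0, x0_hat)
                for a, b in zip(fa, fb))
    count = sum(len(frame) for frame in x0)
    return total / count


def anomaly_score(x0, abar_t_hat, reconstruct, seed=0):
    """Noise x0 to level t_hat, reconstruct it, and score the deviation.

    `reconstruct` is a hypothetical stand-in for the trained denoiser.
    """
    rng = random.Random(seed)
    x_t_hat = forward_diffuse(x0, abar_t_hat, rng)
    x0_hat = reconstruct(x_t_hat)
    return mae_score(x0, x0_hat)
```

The intuition behind this kind of scoring is that a model trained only on normal sounds tends to pull anomalous regions toward normal patterns, so a larger deviation between input and reconstruction suggests an anomaly.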
