Disentangling Hierarchical Features for Anomalous Sound Detection Under Domain Shift
Jian Guan, Jiantong Tian, Qiaoxi Zhu, Feiyang Xiao, Hejing Zhang, Xubo Liu
TL;DR
The paper tackles anomalous sound detection (ASD) under multi-domain shift and introduces Gradient Reversal-based Hierarchical feature Disentanglement (GRHD). GRHD combines two components: a gradient reversal classifier that extracts coarse domain-unrelated features $z_{rev}$, and a hierarchical metadata constraint that learns fine-grained domain-related features $z_{sec}$ and $z_{att}$ from section IDs and machine sound attributes. The model is trained with the weighted objective $L_{total} = \alpha L_{rev} + \beta L_{sec} + \gamma L_{att}$. Adversarial learning and the hierarchical constraints together give a clearer separation of domain-related from domain-unrelated features, which improves ASD performance under domain shift. On the DCASE 2022 Task 2 dataset, GRHD attains state-of-the-art HAUC, supporting the effectiveness of both the gradient reversal mechanism and the hierarchical metadata guidance.
Abstract
Anomalous sound detection (ASD) encounters difficulties with domain shift, where the sounds of machines in target domains differ significantly from those in source domains due to varying operating conditions. Existing methods typically employ domain classifiers to enhance detection performance, but they often overlook the influence of domain-unrelated information. This oversight can hinder the model's ability to clearly distinguish between domains, thereby weakening its capacity to differentiate normal from abnormal sounds. In this paper, we propose a Gradient Reversal-based Hierarchical feature Disentanglement (GRHD) method to address the above challenge. GRHD uses gradient reversal to separate domain-related features from domain-unrelated ones, resulting in more robust feature representations. Additionally, the method employs a hierarchical structure to guide the learning of fine-grained, domain-specific features by leveraging available metadata, such as section IDs and machine sound attributes. Experimental results on the DCASE 2022 Challenge Task 2 dataset demonstrate that the proposed method significantly improves ASD performance under domain shift.
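To make the two ingredients named above concrete, the following is a minimal, dependency-free sketch of a gradient reversal layer and of the weighted objective $L_{total} = \alpha L_{rev} + \beta L_{sec} + \gamma L_{att}$. It is an illustration under stated assumptions, not the authors' implementation: the function names, the reversal strength `lam`, the loss values, and the weights are all hypothetical, and a real model would rely on a deep learning framework's automatic differentiation rather than a hand-written backward pass.

```python
# Sketch only: all names and numbers below are illustrative assumptions.

def grl_forward(z):
    """Forward pass of a gradient reversal layer: the identity map."""
    return list(z)

def grl_backward(grad_out, lam=1.0):
    """Backward pass: flip the sign of the incoming gradient, scaled by lam.

    The classifier placed after this layer is trained normally, while the
    encoder before it receives the reversed gradient and is pushed to
    discard the information that classifier relies on.
    """
    return [-lam * g for g in grad_out]

def total_loss(l_rev, l_sec, l_att, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum: L_total = alpha * L_rev + beta * L_sec + gamma * L_att."""
    return alpha * l_rev + beta * l_sec + gamma * l_att

# Toy feature vector and a toy gradient arriving from the classifier head.
z = [0.5, -1.0, 2.0]
grad_from_classifier = [0.1, -0.2, 0.3]

z_out = grl_forward(z)                              # unchanged features
grad_to_encoder = grl_backward(grad_from_classifier, lam=1.0)  # sign flipped

# Hypothetical per-branch loss values and weights, combined into one scalar.
loss = total_loss(0.7, 0.4, 0.2, alpha=1.0, beta=0.5, gamma=0.5)
```

The key design point is the asymmetry: features pass through unchanged in the forward direction, while the encoder sees the negated gradient in the backward direction, which is what makes the training adversarial.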
