Weakly Supervised Anomaly Detection via Knowledge-Data Alignment

Haihong Zhao, Chenyi Zi, Yang Liu, Chen Zhang, Yan Zhou, Jia Li

TL;DR

KDAlign tackles weakly supervised anomaly detection by introducing knowledge-data alignment, which leverages rule knowledge expressed as propositional formulae and aligns it with data representations via Optimal Transport. A dual-encoder architecture maps data and rules into a common embedding space, and a differentiable OT loss integrates knowledge into WSAD training, providing robustness to noisy or incomplete rules. Empirical results across five real-world datasets show KDAlign consistently improves over baselines, with notable gains in challenging settings and resilience to noisy knowledge. This neural-symbolic integration enhances explainability and generalization in anomaly detection, offering a practical framework for incorporating expert rules into data-driven models.

Abstract

Anomaly detection (AD) plays a pivotal role in numerous web-based applications, including malware detection, anti-money laundering, device failure detection, and network fault analysis. Most methods rely on unsupervised learning and struggle to reach satisfactory detection accuracy because labels are lacking. Weakly Supervised Anomaly Detection (WSAD) addresses this by using a limited number of labeled anomaly samples to enhance model performance. Nevertheless, models trained on so little labeled data still struggle to generalize to unseen anomalies. In this paper, we introduce a novel framework, Knowledge-Data Alignment (KDAlign), which integrates rule knowledge, typically summarized by human experts, to supplement the limited labeled data. Specifically, we map these rules into a knowledge space and recast the incorporation of knowledge as the alignment of knowledge and data. To perform this alignment, we employ the Optimal Transport (OT) technique and add the OT distance as an additional loss term to the original objective function of WSAD methods. Comprehensive experimental results on five real-world datasets demonstrate that KDAlign markedly surpasses its state-of-the-art counterparts, achieving superior performance across various anomaly types.

Paper Structure (33 sections, 9 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: Comparison between the traditional WSAD approach (a) and our proposed knowledge-data alignment WSAD framework, KDAlign (b). Traditional WSAD mainly learns from limited labeled data, whereas our framework introduces knowledge as extra information that supplements the limited labeled data via knowledge-data alignment. Note that samples in the unlabeled data source are usually regarded as normal, although the unlabeled data may be contaminated by noise (Jiang et al., 2023; Pang et al., 2021).
  • Figure 2: Knowledge-data alignment WSAD framework. During training, we first use $\phi_{X}$ and $\phi_{F}$ to map $\mathbf{X}$ and $\mathbf{F}$ to two separate embedding spaces, then use Optimal Transport (OT) to compute the cost matrix $\mathbf{C}$ and obtain the OT plan $\mathbf{S}$. Next, we compute the OT distance $\langle \mathbf{C}, \mathbf{S} \rangle$ and add it to the prediction loss, forming a joint loss. Finally, we use the joint loss to train $\phi_{X}(\cdot)$ and $\phi_{O}(\cdot)$, aligning knowledge and data so that the knowledge is incorporated. At inference, test data yields results directly through $\phi_{X}$ and $\phi_{O}$. In the data space, both abnormal and normal points can be aligned via OT.
  • Figure 3: Noisy knowledge study on KDAlign-ResNet.
  • Figure 4: Knowledge Acquisition: Using decision trees to simulate industrial rule knowledge.
  • Figure 5: The d-DNNF graph structure generated from Formula \ref{cnf2ddnnfexample}.
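
The training step described in the Figure 2 caption (cost matrix, OT plan, OT distance added to the prediction loss) can be illustrated with a minimal sketch. The source does not state which OT solver, cost function, marginals, or loss weight KDAlign uses, so everything below is an assumption made for illustration: a cosine cost, uniform marginals, entropy-regularized Sinkhorn iterations, and the hypothetical names `cosine_cost`, `sinkhorn_plan`, `ot_distance`, `joint_loss`, `epsilon`, and `weight`. It is written in plain Python on toy embeddings; a real implementation would run inside an autodiff framework so that the OT term can be backpropagated.

```python
import math

def cosine_cost(data_emb, rule_emb):
    """Cost matrix C[i][j] = 1 - cosine similarity (an assumed cost choice).

    Assumes no embedding is the zero vector.
    """
    def unit(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]
    xs = [unit(v) for v in data_emb]
    fs = [unit(v) for v in rule_emb]
    return [[1.0 - sum(a * b for a, b in zip(x, f)) for f in fs] for x in xs]

def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT plan S between uniform marginals (assumed solver)."""
    n, m = len(cost), len(cost[0])
    a, b = 1.0 / n, 1.0 / m  # uniform mass per data point / per rule
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

def ot_distance(cost, plan):
    """Inner product <C, S>: total transport cost under plan S."""
    return sum(c * s for c_row, s_row in zip(cost, plan)
               for c, s in zip(c_row, s_row))

def joint_loss(pred_loss, cost, plan, weight=1.0):
    """Prediction loss plus the weighted OT distance (weight is hypothetical)."""
    return pred_loss + weight * ot_distance(cost, plan)

# Toy embeddings: four data points and two rules, all made up.
data_emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]
rule_emb = [[1.0, 0.1], [0.0, 1.0]]

C = cosine_cost(data_emb, rule_emb)
S = sinkhorn_plan(C)
print("OT distance:", ot_distance(C, S))
print("joint loss:", joint_loss(0.5, C, S, weight=0.1))
```

In this toy setup the plan sends most of each data point's mass to the rule whose embedding points in a similar direction, which is the alignment behaviour the caption describes; `epsilon` trades off the sharpness of the plan against numerical stability.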