Angel or Devil: Discriminating Hard Samples and Anomaly Contaminations for Unsupervised Time Series Anomaly Detection

Ruyi Zhang, Hongzuo Xu, Songlei Jian, Yusong Tan, Haifang Zhou, Rulin Xu

TL;DR

PLDA dynamically augments the training data through an iterative process that simultaneously mitigates anomaly contaminations while amplifying informative hard normal samples, enabling a more granular characterization of anomalous patterns.

Abstract

Training in unsupervised time series anomaly detection is constantly plagued by the difficulty of discriminating between harmful 'anomaly contaminations' and beneficial 'hard normal samples'. These two sample types exhibit analogous loss behavior that conventional loss-based methodologies struggle to differentiate. To tackle this problem, we propose a novel approach that supplements traditional loss behavior with 'parameter behavior', enabling a more granular characterization of anomalous patterns. Parameter behavior is formalized by measuring the parametric response to minute perturbations in input samples. Leveraging the complementary nature of parameter and loss behaviors, we further propose a dual Parameter-Loss Data Augmentation method (termed PLDA), implemented within the reinforcement learning paradigm. During the training phase of anomaly detection, PLDA dynamically augments the training data through an iterative process that simultaneously mitigates anomaly contaminations and amplifies informative hard normal samples. PLDA is versatile: it can serve as an additional component that integrates seamlessly with existing anomaly detectors to enhance their detection performance. Extensive experiments on ten datasets show that PLDA significantly improves the performance of four distinct detectors by up to 8%, outperforming three state-of-the-art data augmentation methods.

Paper Structure

This paper contains 32 sections, 2 theorems, 33 equations, 9 figures, 5 tables, 2 algorithms.

Key Result

Theorem 3.1

When a training sample $\mathbf{s}$ is perturbed with a small weight $\epsilon$, the gradient of the optimal parameter $\hat{\theta}_{\epsilon,\mathbf{s}}$ with respect to $\epsilon$ (i.e., the parameter sensitivity) is $\frac{d\hat{\theta}_{\epsilon,\mathbf{s}}}{d\epsilon}\Big|_{\epsilon=0} = -H_{\hat{\theta}}^{-1}\nabla_{\theta}L(\mathbf{s},\hat{\theta})$, in which $H_{\hat{\theta}}=\frac{1}{n} \sum\limits_{i=1}^{n}\nabla_{\theta}^{2}L(\mathbf{s}_i, \theta)$ is the Hessian matrix.
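
A small numerical check can make the idea of parameter sensitivity concrete. The sketch below assumes the theorem takes the classical influence-function form, $-H_{\hat{\theta}}^{-1}\nabla_{\theta}L(\mathbf{s},\hat{\theta})$ evaluated at $\epsilon=0$, and compares it with a finite-difference estimate obtained by actually refitting with the sample upweighted. The linear model, the squared-error loss, the synthetic data, and the names `fit` and `parameter_sensitivity` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def fit(X, y, x_s=None, y_s=None, eps=0.0):
    """Minimize (1/n) * sum_i L_i(theta) + eps * L_s(theta),
    with L(theta) = 0.5 * (x @ theta - y)**2 (assumed loss)."""
    n = X.shape[0]
    A = X.T @ X / n              # Hessian of the average loss
    b = X.T @ y / n
    if x_s is not None:
        A = A + eps * np.outer(x_s, x_s)   # extra weight on sample s
        b = b + eps * y_s * x_s
    return np.linalg.solve(A, b)

def parameter_sensitivity(X, y, x_s, y_s):
    """Analytic sensitivity: -H^{-1} grad_theta L(s, theta_hat)."""
    n = X.shape[0]
    theta_hat = fit(X, y)
    H = X.T @ X / n
    grad = x_s * (x_s @ theta_hat - y_s)   # gradient of the loss at s
    return -np.linalg.solve(H, grad)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
x_s, y_s = X[0], y[0]

# Finite-difference estimate of d(theta_hat)/d(eps) at eps = 0.
eps = 1e-6
finite_diff = (fit(X, y, x_s, y_s, eps) - fit(X, y)) / eps
analytic = parameter_sensitivity(X, y, x_s, y_s)

print("finite difference:", finite_diff)
print("analytic         :", analytic)
```

For this quadratic loss the refit has a closed form, so the two estimates should agree up to the finite-difference error. For a deep detector the Hessian inverse would have to be approximated rather than solved exactly.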

Figures (9)

  • Figure 1: Hard normal sample analysis. Loss value alone fails to differentiate the hard normal samples, but adding a new dimension, i.e., parameter behavior, allows for clear discrimination of sample types.
  • Figure 2: Spectrograms of a simple normal sample, a hard normal sample, and an anomaly contamination. The anomaly contamination contains more high-frequency components due to additional noise or abrupt changes. The hard normal sample also has some high-frequency components, but they are less pronounced than those in the anomaly contamination.
  • Figure 3: Overview of PLDA. One panel illustrates the training workflow of TSAD, where PLDA acts as an additional component that iteratively augments the contaminated training set $\mathcal{S}_i$. The other panel shows the augmentation details. Specifically, in the $t$-th iteration, PLDA augments the state $\mathbf{s}_t$ sampled from $\mathcal{S}_i$ as follows: (1) The agent selects the optimal action $a_t$ that maximizes the expected total future reward. (2) Data augmentation applies the chosen action, resulting in the augmented training set $\mathcal{S}_{i,t+1}$. (3) The state transition function $G$ generates the next state $\mathbf{s}_{t+1}\in\mathcal{S}_{i,t+1}$ for analysis. (4) Data investigation computes the dual-dimensional reward $r_{t+1}$ of $\mathbf{s}_{t+1}$ based on parameter and loss behavior.
  • Figure 4: Illustration of the sliding window, with $w$ as the window size and $h$ as the stride. The sliding window converts the original time series into a series of samples. Ideally, we aim to reduce anomaly contaminations (AC) and enrich hard samples (HS). However, current methods either (a) enrich both AC and HS or (b) reduce them simultaneously. In contrast, as shown in (c), our adaptive method ensures AC reduction and HS enrichment.
  • Figure 5: F1 results of the four TSAD models augmented by four data augmentation methods on the training set with different contamination rates.
  • ...and 4 more figures
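
The sliding window described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, assuming a window of length `w` that advances by stride `h` and drops any trailing points that do not fill a complete window; the function name `sliding_windows` and the toy series are hypothetical and not taken from the paper.

```python
import numpy as np

def sliding_windows(series, w, h):
    """Split a time series of shape (T,) or (T, d) into windows of
    length w taken every h steps. Incomplete trailing windows are
    dropped (an assumption of this sketch)."""
    series = np.asarray(series)
    if w <= 0 or h <= 0:
        raise ValueError("w and h must be positive")
    if len(series) < w:
        # Not enough points for a single window.
        return np.empty((0, w) + series.shape[1:], dtype=series.dtype)
    starts = range(0, len(series) - w + 1, h)
    return np.stack([series[i:i + w] for i in starts])

# Toy series 0..9 with window size 4 and stride 2.
series = np.arange(10)
samples = sliding_windows(series, w=4, h=2)
print(samples)
```

With a smaller stride the same stretch of the series appears in more windows, and with a larger stride in fewer. That is how the choice of $h$ changes how often any given segment, whether an anomaly contamination or a hard sample, is represented in the training set.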

Theorems & Definitions (5)

  • Theorem 3.1
  • Theorem 3.2
  • proof
  • proof
  • proof