Optimal Classification-based Anomaly Detection with Neural Networks: Theory and Practice

Tian-Yi Zhou, Matthew Lau, Jizhou Chen, Wenke Lee, Xiaoming Huo

TL;DR

This work establishes non-asymptotic upper bounds and a convergence rate for the excess risk of rectified linear unit (ReLU) neural networks trained on synthetic anomalies, providing the first theoretical guarantees for unsupervised neural network-based anomaly detectors.

Abstract

Anomaly detection is an important problem in many application areas, such as network security. Many deep learning methods for unsupervised anomaly detection perform well empirically but lack theoretical guarantees. By casting anomaly detection as a binary classification problem, we establish non-asymptotic upper bounds and a convergence rate for the excess risk of rectified linear unit (ReLU) neural networks trained on synthetic anomalies. Our convergence rate for the excess risk matches the minimax optimal rate in the literature. Furthermore, we provide lower and upper bounds on the number of synthetic anomalies that can attain this optimality. For practical implementation, we relax some conditions to improve the search for the empirical risk minimizer, which leads to performance competitive with other classification-based methods for anomaly detection. Overall, our work provides the first theoretical guarantees for unsupervised neural network-based anomaly detectors, along with empirical insights on how to design them well.
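The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform sampling of synthetic anomalies on $[0,1]^d$, the hinge loss, the one-hidden-layer architecture, and the names `make_training_set` and `train_relu_classifier` are all assumptions made for illustration.

```python
import numpy as np


def make_training_set(normal, n_synthetic, rng):
    """Stack normal points (label +1) with synthetic anomalies (label -1)."""
    d = normal.shape[1]
    # Assumption: features are scaled to [0, 1]^d and anomalies are drawn uniformly.
    synthetic = rng.uniform(0.0, 1.0, size=(n_synthetic, d))
    X = np.vstack([normal, synthetic])
    y = np.concatenate([np.ones(len(normal)), -np.ones(n_synthetic)])
    return X, y


def train_relu_classifier(X, y, hidden=32, lr=0.05, epochs=500, seed=0):
    """Fit a one-hidden-layer ReLU network by full-batch subgradient descent
    on the hinge loss (an assumed surrogate loss, chosen for illustration)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, np.sqrt(2.0 / hidden), size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        Z = X @ W1 + b1            # hidden pre-activations
        H = np.maximum(Z, 0.0)     # ReLU
        f = H @ w2 + b2            # real-valued classifier output
        # Subgradient of mean(max(0, 1 - y*f)) w.r.t. f: -y/n where the margin is below 1.
        g_f = np.where(y * f < 1.0, -y, 0.0) / n
        g_Z = np.outer(g_f, w2) * (Z > 0.0)
        W1 -= lr * (X.T @ g_Z)
        b1 -= lr * g_Z.sum(axis=0)
        w2 -= lr * (H.T @ g_f)
        b2 -= lr * g_f.sum()

    def score(X_query):
        """Larger scores mean 'more normal'; the sign is the predicted label."""
        return np.maximum(X_query @ W1 + b1, 0.0) @ w2 + b2

    return score


# Toy usage: "normal" data occupies one corner of the unit square.
rng = np.random.default_rng(0)
normal = rng.uniform(0.0, 0.3, size=(200, 2))
X, y = make_training_set(normal, n_synthetic=200, rng=rng)
score = train_relu_classifier(X, y)
scores = score(np.array([[0.1, 0.1], [0.9, 0.9]]))
print(scores)  # first point lies inside the normal region, second far outside
```

Only normal data is needed at training time: the negative class is generated, so the detector stays unsupervised with respect to real anomalies. How many synthetic points to draw is the quantity the abstract says is bounded from below and above; the value 200 here is arbitrary.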

Paper Structure

This paper contains 34 sections, 12 theorems, 136 equations, 1 figure, and 6 tables.

Key Result

Theorem 1

Let $n\geq 3$, $d\in \mathbb N$, $\alpha, r >0$, and $1/2\leq s \leq 1$. Consider the hypothesis class $\mathcal{H}_\tau$ defined in Definition 1 (Hypothesis Space) with $N =\left\lceil\left(\frac{n}{(\log n)^4}\right)^{\frac{d}{d+ \alpha (q+2)}}\right\rceil$ and $m= \Big\lceil \left(1+ \frac{\alpha}{d}\right) \frac{\log \cdots}{\cdots} \cdots$ [the statement is truncated here in the source], where $\widetilde{C}$ is a positive constant independent of $n$ and $\delta$.
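To get a feel for the scale of $N$ in Theorem 1, the formula $N = \lceil (n/(\log n)^4)^{d/(d+\alpha(q+2))} \rceil$ can be evaluated directly. The sketch below assumes the natural logarithm and treats $q$ as a given problem parameter (it is not defined in this excerpt); the function name `network_size_N` is hypothetical.

```python
import math


def network_size_N(n, d, alpha, q):
    """Evaluate N = ceil((n / (log n)^4) ** (d / (d + alpha * (q + 2)))).

    Assumptions: log is the natural logarithm; q is supplied by the caller.
    """
    if n < 3:
        raise ValueError("Theorem 1 requires n >= 3")
    exponent = d / (d + alpha * (q + 2))
    return math.ceil((n / math.log(n) ** 4) ** exponent)


# N grows slowly with the sample size n because of the (log n)^4 factor
# and because the exponent is strictly below 1.
for n in (10_000, 1_000_000, 100_000_000):
    print(n, network_size_N(n, d=2, alpha=1.0, q=0.0))
```

For fixed $n$, $\alpha$, and $q$, a larger dimension $d$ pushes the exponent toward 1 and therefore increases $N$.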

Figures (1)

  • Figure 1: Accuracy on the NSL-KDD network intrusion dataset for various test sets. The mean and standard deviation across 3 runs are plotted. Accuracy converges as the number of training samples increases.

Theorems & Definitions (22)

  • Definition 1: Hypothesis Space
  • Theorem 1
  • Theorem 2
  • Lemma 1: Theorem 5 in Schmidt-Hieber (2020)
  • Lemma 2: Decomposition of $\varepsilon(\hat{f}_{T, T^\prime,\phi}) - \varepsilon(f_c)$
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • ...and 12 more