Optimal Classification-based Anomaly Detection with Neural Networks: Theory and Practice

Tian-Yi Zhou; Matthew Lau; Jizhou Chen; Wenke Lee; Xiaoming Huo

TL;DR

This work establishes non-asymptotic upper bounds and a convergence rate for the excess risk of rectified linear unit (ReLU) neural networks trained on synthetic anomalies, providing the first theoretical guarantees for unsupervised neural network-based anomaly detectors.

Abstract

Anomaly detection is an important problem in many application areas, such as network security. Many deep learning methods for unsupervised anomaly detection perform well empirically but lack theoretical guarantees. By casting anomaly detection as a binary classification problem, we establish non-asymptotic upper bounds and a convergence rate for the excess risk of rectified linear unit (ReLU) neural networks trained on synthetic anomalies. Our convergence rate for the excess risk matches the minimax optimal rate in the literature. Furthermore, we provide lower and upper bounds on the number of synthetic anomalies for which this optimality is attained. For practical implementation, we relax some conditions to improve the search for the empirical risk minimizer, which leads to performance competitive with other classification-based methods for anomaly detection. Overall, our work provides the first theoretical guarantees for unsupervised neural network-based anomaly detectors, along with empirical insights on how to design them well.
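The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The one-dimensional data, the uniform sampling of synthetic anomalies on [0, 1], the hinge loss, the equal class weighting, and the small grid of hand-built two-unit ReLU classifiers are all assumptions made for illustration. The sketch labels normal samples +1 and synthetic anomalies -1, then picks the candidate with the lowest empirical risk.

```python
import random

def relu(z):
    return max(0.0, z)

def make_classifier(width):
    """Hypothetical two-unit ReLU network on [0, 1].

    relu(x - 0.5) + relu(0.5 - x) equals |x - 0.5|, so the output is
    positive (predicted normal) exactly when |x - 0.5| < width / 2.
    """
    def f(x):
        return 1.0 - 2.0 * (relu(x - 0.5) + relu(0.5 - x)) / width
    return f

def hinge(margin):
    # Hinge loss is an assumed surrogate; the excerpt does not name the loss.
    return max(0.0, 1.0 - margin)

def empirical_risk(f, normal, synthetic):
    # Normal samples carry label +1, synthetic anomalies label -1.
    # Equal weighting of the two samples is an assumption of this sketch.
    loss_normal = sum(hinge(+1.0 * f(x)) for x in normal) / len(normal)
    loss_synth = sum(hinge(-1.0 * f(x)) for x in synthetic) / len(synthetic)
    return 0.5 * (loss_normal + loss_synth)

rng = random.Random(0)
normal = [rng.uniform(0.4, 0.6) for _ in range(200)]     # stand-in "normal" data
synthetic = [rng.uniform(0.0, 1.0) for _ in range(200)]  # synthetic anomalies

# Empirical risk minimization over a tiny, finite hypothesis class.
widths = [0.1, 0.2, 0.4, 0.8]
risks = {w: empirical_risk(make_classifier(w), normal, synthetic) for w in widths}
best_width = min(risks, key=risks.get)
f_hat = make_classifier(best_width)

def predict(x):
    """Return +1 for normal and -1 for anomaly."""
    return 1 if f_hat(x) >= 0 else -1
```

Calling `predict(0.5)` classifies the centre of the normal region, and `predict(0.0)` classifies a point far from it. A real detector would search over trained network weights rather than a fixed grid of widths.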

Paper Structure (34 sections, 12 theorems, 136 equations, 1 figure, 6 tables)

Introduction
Problem Formulation
Model Anomaly Detection as Density Level Set Estimation
Density Level Set Estimation as Binary Classification
Proposed Method: Empirical Risk Minimization with ReLU Neural Network
Mathematical Formulation of ReLU Neural Networks
Target Neural Network Function Class
Finding Classifier via Empirical Risk Minimization
Theoretical Guarantee of ReLU Network for Unsupervised AD
Experiments
Real-World Datasets for Evaluations
Excess Risk Convergence Experiments
Practical Implementation
Results and Discussion
Evaluation with Anomaly Detection Metrics
...and 19 more sections

Key Result

Theorem 1

Let $n\geq 3$, $d\in \mathbb N$, $\alpha, r >0$, and $1/2\leq s \leq 1$. Consider the hypothesis class $\mathcal{H}_\tau$ defined in Definition 1 (Hypothesis Space) with $N =\left\lceil\left(\frac{n}{(\log n)^4}\right)^{\frac{d}{d+ \alpha (q+2)}}\right\rceil$ and $m= \left\lceil \left(1+ \frac{\alpha}{d}\right) \frac{\log \cdots}{\cdots} \right\rceil$ [...], where $\widetilde{C}$ is a positive constant independent of $n$ and $\delta$.
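To get a feel for how the parameter $N$ in Theorem 1 scales with the sample size, the formula can be evaluated numerically. The sketch below is illustrative only. The function name `compute_N` is hypothetical, and $q$ is treated as a non-negative real parameter because its definition does not appear in this excerpt.

```python
import math

def compute_N(n, d, alpha, q):
    """Evaluate N = ceil((n / (log n)^4) ** (d / (d + alpha * (q + 2)))).

    n: sample size (the theorem requires n >= 3)
    d: input dimension (positive integer)
    alpha: positive parameter from the theorem statement
    q: parameter from the theorem statement (assumed non-negative here)
    """
    if n < 3:
        raise ValueError("Theorem 1 assumes n >= 3")
    base = n / (math.log(n) ** 4)
    exponent = d / (d + alpha * (q + 2))
    return math.ceil(base ** exponent)

# For moderate n the ratio n / (log n)^4 is below 1, so N stays at 1.
small_n = compute_N(100, d=1, alpha=1.0, q=0.0)
# N grows slowly once n is large enough for the ratio to exceed 1.
large_n = compute_N(10**6, d=2, alpha=1.0, q=1.0)
```

Because of the $(\log n)^4$ factor, $N$ grows only polynomially in $n$ with an exponent below 1, and larger values of $\alpha$ or $q$ slow that growth further.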

Figures (1)

Figure 1: Accuracy on NSL-KDD network intrusion dataset, with various test data. Mean and standard deviation across 3 runs plotted. Accuracy converges with more training samples.

Theorems & Definitions (22)

Definition 1: Hypothesis Space
Theorem 1
Theorem 2
Lemma 1: Theorem 5 in Schmidt-Hieber (2020)
Lemma 2: Decomposition of $\varepsilon(\Hat{f}_{T, T^\prime,\phi}) - \varepsilon(f_c)$
proof
Lemma 3
proof
Lemma 4
proof
...and 12 more
