Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling

Lukas Schynol; Marius Pesavento

TL;DR

This work addresses anomaly detection (AD) in network flows from incomplete measurements. It proposes a novel block-successive convex approximation (BSCA) algorithm based on a regularized model-fitting objective, in which normal flows are modeled as low-rank tensors and anomalies as sparse, and applies deep unrolling to turn the algorithm into a deep network architecture with learnable regularization parameters.

Abstract

Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems. While deep learning has shown state-of-the-art AD performance, its application in critical systems is hindered by concerns regarding training data efficiency, domain adaptation and interpretability. This work considers AD in network flows using incomplete measurements, leveraging a robust tensor decomposition approach and deep unrolling techniques to address these challenges. We first propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective where the normal flows are modeled as low-rank tensors and anomalies as sparse. An augmentation of the objective is introduced to decrease the computational cost. We apply deep unrolling to derive a novel deep network architecture based on our proposed algorithm, treating the regularization parameters as learnable weights. Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics, improving AD performance while maintaining a low parameter count and preserving the problem's permutation equivariances. To optimize the deep network weights for detection performance, we employ a homotopy optimization approach based on an efficient approximation of the area under the receiver operating characteristic curve. Extensive experiments on synthetic and real-world data demonstrate that our proposed deep network architecture exhibits a high training data efficiency, outperforms reference methods, and adapts seamlessly to varying network topologies.
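The abstract mentions training the network weights through an approximation of the area under the ROC curve (AUC) combined with homotopy optimization. The exact approximation is not stated in this summary, so the following is only a minimal sketch under assumptions: it uses a common pairwise sigmoid surrogate, and the names `exact_auc`, `soft_auc` and `temperature` are hypothetical. It illustrates the general idea that a smooth surrogate approaches the exact, non-differentiable AUC as the smoothing is gradually reduced.

```python
import numpy as np

def exact_auc(scores, labels):
    """Exact AUC: fraction of (anomaly, normal) pairs ranked correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a correctly ranked pair.
    return np.mean((diff > 0) + 0.5 * (diff == 0))

def soft_auc(scores, labels, temperature):
    """Smooth surrogate (assumption): the step function is replaced by a sigmoid."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return np.mean(1.0 / (1.0 + np.exp(-diff / temperature)))

# Toy anomaly scores; label 1 marks an anomaly.
scores = np.array([0.9, 0.8, 0.3, 0.2, 0.1])
labels = np.array([1, 0, 1, 0, 0])

auc = exact_auc(scores, labels)
# Homotopy-style schedule: shrink the temperature so the surrogate sharpens.
soft_values = [soft_auc(scores, labels, t) for t in (1.0, 0.1, 0.01)]
```

In a homotopy scheme of this kind, training would start with a large temperature, where the surrogate is smooth and easy to optimize, and tighten it over time. The pairwise form shown here scales with the product of the number of anomalous and normal samples, so it should not be read as the efficient approximation the authors refer to.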

Paper Structure (32 sections, 2 theorems, 29 equations, 7 figures, 10 tables, 2 algorithms)

Introduction
Paper Outline
Notation
Related Work
System Model
Low-Rank Tensor Recovery Algorithm
Matrix-Factorization-Based Recovery Problem
CPD-Based Recovery Problem
BSCA Algorithm
Augmented CPD-Based Recovery Problem
Augmented BSCA Algorithm
Unrolled CPD-Based Anomaly Detection
Metric for AD Performance
Deep Unrolling
Non-Adaptive Unrolled BSCA-Based RPCA
...and 17 more sections

Key Result

Proposition 4.1

Any limit point of the sequence $(\bm{P}^{(\ell)}, \bm{Q}_1^{(\ell)}, \bm{Q}_2^{(\ell)}, \bm{A}^{(\ell)})_\ell$ returned by the BSCA algorithm for the CPD-based recovery problem for $L\to\infty$ is a stationary point of that problem's objective function.

Figures (7)

Figure 1: Example network with $N=4$ nodes and $F=3$ directed flows routed across the edges. The directed edge traffic $y_{2\to 4}$ from node $2$ to $4$ is a superposition of the directed flow from node $1$ to $4$ (blue) and from node $3$ to $4$ (red).
Figure 2: Average over the regularization parameters $(\lambda, \mu)$ after 100 iterations of BSCA-AD for the data set S1. Grid resolution: $0.25$.
Figure 3: Block diagram of the unrolled U-tBSCA-AUG, where each layer $\breve{\mathcal{F}}_\mathrm{td|aug}$ consists of the block updates of the BSCA algorithm or the augmented BSCA algorithm, respectively. The parameters $\nu^{(\ell)}$, $\lambda^{(\ell)}$ and $\mu^{(\ell)}$ are learnable.
Figure 4: Block diagram of the adaptive unrolled algorithm AU-tBSCA-AUG, where $\breve{\mathcal{F}}_{\mathrm{aug}}$ is one layer of U-tBSCA-AUG, $f_\mathrm{ft}$ is a permutation invariant feature map, $f_{\mathrm{par}}$ is a learnable map for the parameters $\bm{\mathcal{W}}$ and $\bm{\mathcal{M}}$, respectively, and $\bm{h}_{\bm{\mathcal{W}}}^{(\ell)}$ and $\bm{h}_{\bm{\mathcal{M}}}^{(\ell)}$ are the embeddings per link or flow and time step, respectively.
Figure 5: Estimated validation of non-unrolled methods averaged over 50 scenarios.
...and 2 more figures
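The superposition described in the Figure 1 caption can be sketched with a small routing matrix. This is an illustrative reconstruction, not the paper's code: the caption only states that edge $2\to 4$ carries the flows $1\to 4$ and $3\to 4$, so the full paths, the third flow ($1\to 3$), the traffic volumes and the names `edges`, `flow_paths` and `routing_matrix` are assumptions.

```python
import numpy as np

# Directed edges of a hypothetical 4-node network.
edges = [(1, 2), (3, 2), (2, 4), (1, 3)]

# Each flow (origin, destination) is mapped to the edges it traverses.
flow_paths = {
    (1, 4): [(1, 2), (2, 4)],  # "blue" flow in the caption
    (3, 4): [(3, 2), (2, 4)],  # "red" flow in the caption
    (1, 3): [(1, 3)],          # third flow: assumed for illustration
}

def routing_matrix(edges, flow_paths):
    """Binary matrix R with R[e, f] = 1 if flow f is routed over edge e."""
    R = np.zeros((len(edges), len(flow_paths)))
    for f, path in enumerate(flow_paths.values()):
        for edge in path:
            R[edges.index(edge), f] = 1.0
    return R

R = routing_matrix(edges, flow_paths)

# Per-flow traffic at one time step (made-up values).
x = np.array([5.0, 2.0, 1.0])

# Edge traffic is the superposition of all flows routed over each edge.
y = R @ x
y_2_to_4 = y[edges.index((2, 4))]
```

With these made-up volumes, the traffic on edge $2\to 4$ is the sum of the first two flows. Because the edge measurements mix several flows, attributing an anomaly to an individual flow requires additional structure such as the low-rank and sparsity assumptions described in the abstract.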

Theorems & Definitions (3)

Remark
Proposition 4.1
Proposition 4.2
