PARD-SSM: Probabilistic Cyber-Attack Regime Detection via Variational Switching State-Space Models

Prakul Sunil Hiremath, PeerAhammad M Bagawan, Sahil Bhekane

Abstract

Modern adversarial campaigns unfold as sequences of behavioural phases - Reconnaissance, Lateral Movement, Intrusion, and Exfiltration - each often indistinguishable from legitimate traffic when viewed in isolation. Existing intrusion detection systems (IDS) fail to capture this structure: signature-based methods cannot detect zero-day attacks, deep-learning models provide opaque anomaly scores without stage attribution, and standard Kalman Filters cannot model non-stationary multi-modal dynamics. We present PARD-SSM, a probabilistic framework that models network telemetry as a Regime-Dependent Switching Linear Dynamical System with K = 4 hidden regimes. A structured variational approximation reduces inference complexity from exponential to O(TK^2), enabling real-time detection on standard CPU hardware. An online EM algorithm adapts model parameters, while KL-divergence gating suppresses false positives. Evaluated on CICIDS2017 and UNSW-NB15, PARD-SSM achieves F1 scores of 98.2% and 97.1%, with latency less than 1.2 ms per flow. The model also produces predictive alerts approximately 8 minutes before attack onset, a capability absent in prior systems.

Paper Structure

This paper contains 64 sections, 4 theorems, 21 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Proposition 6.1

The exact posterior $p(\bm{x}_{1:T}, s_{1:T} \,|\, \bm{y}_{1:T})$ in a Switching LDS with $K$ regimes and $T$ time steps cannot be computed in time polynomial in $T$. The number of distinct Gaussian components in $p(\bm{x}_T \,|\, \bm{y}_{1:T})$ is exactly $K^T$.
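
To make the gap concrete, the following minimal sketch compares the $K^T$ component count stated in the proposition with the $O(TK^2)$ cost quoted in the abstract for the structured variational approximation. The function names are illustrative and not taken from the paper, and the $TK^2$ figure is a bare operation count with constants ignored.

```python
def exact_component_count(K: int, T: int) -> int:
    """Gaussian components in the exact filtering posterior: one per regime path."""
    return K ** T


def structured_cost(K: int, T: int) -> int:
    """Operation count of an O(T K^2) recursion, ignoring constant factors."""
    return T * K * K


if __name__ == "__main__":
    K = 4  # number of hidden regimes used in the paper
    for T in (1, 5, 10, 20):
        print(T, exact_component_count(K, T), structured_cost(K, T))
```

The exact count grows exponentially with the sequence length, while the structured count grows only linearly in $T$.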

Figures (3)

  • Figure 1: PARD-SSM system architecture. Six functional modules form a processing pipeline from raw network telemetry to probabilistic kill-chain alerts. The dashed feedback arrow denotes the online EM loop by which Module 5 (OEMPU) continuously updates regime-specific parameters in Module 3 (PRKFB).
  • Figure 2: Regime Posterior Probabilities vs. Time (CICIDS2017 scenario). The PARD-SSM system raises a Reconnaissance alert ($t=12$ min) approximately 8 minutes before the ground-truth attack onset ($t=20$ min), demonstrating predictive kill-chain detection capability. Posteriors sum to unity ($\sum_{s=0}^{3}\gamma_{t}(s)=1$) at each time-step.
  • Figure 3: Regime Transition Probability Matrix $\Pi$. Each cell $\Pi_{s,s'}$ encodes the learned probability of transitioning from regime $s_{t-1}$ (row) to regime $s_t$ (column). High diagonal values confirm that regimes are persistent, while off-diagonal kill-chain transitions ($0\rightarrow1\rightarrow2\rightarrow3$) enable predictive alerting.
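
The way a persistent, kill-chain-ordered transition matrix yields normalised regime posteriors and one-step-ahead predictions can be sketched with a plain discrete forward filter. This is a simplified stand-in, not the paper's structured variational inference: it computes filtered rather than smoothed posteriors and takes the per-regime observation likelihoods as given. The matrix `PI`, the initial distribution `PI0`, and the likelihood values in `LIK` are hypothetical numbers chosen for illustration.

```python
from typing import List

# Hypothetical transition matrix: high diagonal (persistent regimes) plus
# kill-chain transitions 0 -> 1 -> 2 -> 3. Each row sums to 1.
PI = [
    [0.90, 0.10, 0.00, 0.00],
    [0.00, 0.90, 0.10, 0.00],
    [0.00, 0.00, 0.90, 0.10],
    [0.05, 0.00, 0.00, 0.95],
]
PI0 = [1.0, 0.0, 0.0, 0.0]  # assumed start in regime 0


def predict(gamma: List[float], Pi: List[List[float]]) -> List[float]:
    """One-step-ahead regime distribution: gamma times Pi (row vector)."""
    K = len(gamma)
    return [sum(gamma[i] * Pi[i][j] for i in range(K)) for j in range(K)]


def forward_filter(Pi, pi0, lik) -> List[List[float]]:
    """Normalised forward recursion; lik[t][s] stands for p(y_t | s_t = s).

    Each step costs O(K^2), so the full pass costs O(T K^2).
    """
    gammas = []
    prior = list(pi0)
    for l_t in lik:
        unnorm = [p * l for p, l in zip(prior, l_t)]
        z = sum(unnorm)
        gamma = [u / z for u in unnorm]
        gammas.append(gamma)
        prior = predict(gamma, Pi)  # prior for the next time step
    return gammas


# Hypothetical per-regime likelihoods for three observations.
LIK = [
    [0.80, 0.10, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
]

if __name__ == "__main__":
    gammas = forward_filter(PI, PI0, LIK)
    for t, g in enumerate(gammas):
        print(t, [round(v, 3) for v in g])
    print("next-step prediction:", [round(v, 3) for v in predict(gammas[-1], PI)])
```

Because every row of `PI` sums to one and each step renormalises, the posteriors sum to unity at every time step, as stated in the caption of Figure 2. The `predict` step is what lets probability mass reach a later regime before the observations alone would support it.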

Theorems & Definitions (7)

  • Remark 4.1: Eigenstructure as Discriminative Signal
  • Proposition 6.1: Intractability of Exact Switching Inference
  • proof
  • Proposition 6.2: ELBO Decomposition
  • Proposition 6.3: Computational Complexity
  • Remark 7.1: Convergence and Stability
  • Proposition 8.1: FPR Reduction via KL Gating