Risk-Calibrated Bayesian Streaming Intrusion Detection with SRE-Aligned Decisions

Michel Youssef

TL;DR

This work addresses intrusion detection on imbalanced, drifting data by combining Bayesian Online Changepoint Detection (BOCPD) with cost thresholds aligned to SRE error budgets. By modeling run lengths and applying a cost-sensitive decision rule, the approach yields calibrated, actionable alerts that respect operational error budgets. Empirical results on UNSW-NB15 and CIC-IDS2017 show improved precision-recall at higher recall levels and better probability calibration than unsupervised baselines, supported by reliability diagrams and latency analyses. The framework targets deployment on enterprise telemetry; reproducibility materials are described, and future directions include live deployment and deeper feature integration.
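The error budget and cost-sensitive decision rule mentioned above can be illustrated with a minimal sketch. It uses the textbook Bayes-risk rule (alert when the expected cost of staying silent exceeds the expected cost of alerting), which is not necessarily the paper's exact budget-constrained optimization. The function names and the 30-day month are assumptions made for illustration.

```python
def error_budget_minutes(slo, days=30):
    """Allowed downtime in minutes for an availability SLO over `days` days."""
    return (1.0 - slo) * days * 24 * 60


def bayes_risk_threshold(cost_fp, cost_fn):
    """Probability above which alerting has the lower expected cost.

    Alerting costs (1 - p) * cost_fp in expectation; staying silent costs
    p * cost_fn. Alerting is cheaper once p >= cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def should_alert(p_attack, cost_fp, cost_fn):
    """Hypothetical decision helper: True when an alert minimises expected cost."""
    return p_attack >= bayes_risk_threshold(cost_fp, cost_fn)
```

For a 99.9% availability SLO over a 30-day month, `error_budget_minutes(0.999)` evaluates to about 43.2 minutes. How that budget feeds into the cost terms is a modeling choice this sketch leaves open.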

Abstract

We present a risk-calibrated approach to streaming intrusion detection that couples Bayesian Online Changepoint Detection (BOCPD) with decision thresholds aligned to Site Reliability Engineering (SRE) error budgets. BOCPD provides run-length posteriors that adapt to distribution shift and concept drift; we map these posteriors to alert decisions by optimizing expected operational cost under false-positive and false-negative budgets. We detail the hazard model, conjugate updates, and an O(1)-per-event implementation. A concrete SRE example shows how a 99.9% availability SLO (43.2 minutes per month error budget) yields a probability threshold near 0.91 when missed incidents are 10x more costly than false alarms. We evaluate on the full UNSW-NB15 and CIC-IDS2017 benchmarks with chronological splits, comparing against strong unsupervised baselines (ECOD, COPOD, and LOF). Metrics include PR-AUC, ROC-AUC, Brier score, calibration reliability diagrams, and detection latency measured in events. Results indicate improved precision-recall at mid to high recall and better probability calibration relative to baselines. We release implementation details, hyperparameters, and ablations for hazard sensitivity and computational footprint. Code and reproducibility materials will be made available upon publication; datasets and implementation are available from the corresponding author upon reasonable request.
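To make the run-length posterior concrete, here is a minimal BOCPD sketch with a constant hazard and a Normal-Normal conjugate model (known observation variance, unknown mean). The class name, default hyperparameters, Gaussian likelihood, and the truncation at `max_run` are all assumptions chosen for illustration; the paper's own hazard model and O(1) implementation may differ. Truncation bounds the work per event by `max_run`, independent of stream length.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class TruncatedBOCPD:
    """Illustrative BOCPD with constant hazard and a truncated run-length posterior."""

    def __init__(self, hazard=0.01, prior_mean=0.0, prior_var=25.0,
                 obs_var=1.0, max_run=200):
        self.hazard = hazard
        self.prior_mean = prior_mean
        self.prior_var = prior_var
        self.obs_var = obs_var
        self.max_run = max_run
        self._reset()

    def _reset(self):
        self.probs = [1.0]                 # P(run length = i | data so far)
        self.means = [self.prior_mean]     # posterior mean for each run length
        self.variances = [self.prior_var]  # posterior variance for each run length

    def update(self, x):
        """Consume one observation and return the updated run-length posterior."""
        # Predictive density of x under each run-length hypothesis.
        pred = [gaussian_pdf(x, m, v + self.obs_var)
                for m, v in zip(self.means, self.variances)]
        weighted = [p * q for p, q in zip(self.probs, pred)]
        evidence = sum(weighted)
        if evidence == 0.0:  # numerical underflow: fall back to the prior
            self._reset()
            return self.probs
        # Changepoint mass goes to run length 0; the rest grows by one.
        probs = [self.hazard * evidence] + [(1.0 - self.hazard) * w for w in weighted]
        # Conjugate update of the mean posterior for every grown run.
        means, variances = [self.prior_mean], [self.prior_var]
        for m, v in zip(self.means, self.variances):
            post_var = 1.0 / (1.0 / v + 1.0 / self.obs_var)
            means.append(post_var * (m / v + x / self.obs_var))
            variances.append(post_var)
        # Truncate so the cost per event stays bounded, then renormalise.
        keep = self.max_run + 1
        probs, means, variances = probs[:keep], means[:keep], variances[:keep]
        total = sum(probs)
        self.probs = [p / total for p in probs]
        self.means, self.variances = means, variances
        return self.probs

    def map_run_length(self):
        """Most probable current run length."""
        return max(range(len(self.probs)), key=self.probs.__getitem__)

    def recent_change_prob(self, window):
        """Posterior mass on run lengths of at most `window` events."""
        return sum(self.probs[:window + 1])
```

With a constant hazard, the posterior mass on run length zero always equals the hazard itself, so the sketch exposes `recent_change_prob` (mass on short runs) as a candidate score for a downstream probability threshold.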

Paper Structure

This paper contains 14 sections, 3 equations, 7 figures.

Figures (7)

  • Figure 1: Precision-recall curve on the UNSW-NB15 stream. Our method maintains high precision across recall levels.
  • Figure 2: Precision-recall curve on the CIC-IDS2017 stream. The risk-calibrated detector outperforms unsupervised baselines.
  • Figure 3: ROC curve on the UNSW-NB15 stream. All detectors achieve high AUC, but PR metrics reveal differences under imbalance.
  • Figure 4: ROC curve on the CIC-IDS2017 stream.
  • Figure 5: Reliability diagram for the UNSW-NB15 stream. The dashed line denotes perfect calibration, and our probabilities lie close to this diagonal.
  • ...and 2 more figures
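
The calibration quantities behind the reliability diagrams listed above can be sketched as follows. This is a generic illustration of the Brier score and equal-width reliability binning, not the paper's evaluation code; the function names and the default of ten bins are assumptions.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equally long")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def reliability_bins(probs, labels, n_bins=10):
    """Return (mean predicted probability, observed positive rate, count) per bin.

    Bins are equal-width over [0, 1]; empty bins are skipped. A perfectly
    calibrated detector has the first two values equal in every bin, which is
    the diagonal of a reliability diagram.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]
```

Plotting the first value of each tuple against the second gives the reliability curve; its distance from the diagonal shows where the detector is over- or under-confident.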