Mask-Based Window-Level Insider Threat Detection for Campaign Discovery

Jericho Cain, Hayden Beadles

TL;DR

This paper addresses unsupervised window-level insider threat detection in sparse enterprise audit data and extends it to campaign discovery. It introduces a dual-channel mask-value convolutional autoencoder that models activity presence and activity magnitude separately. The model outperforms standard unsupervised autoencoder baselines at the window level, with a reported PR-AUC of about $0.71$ and operating points that reach zero false alarms. The authors then show that campaigns can be detected by sparsely aggregating high-confidence window-level scores: simple top-$k$ pooling over six-day sequences attains PR-AUCs of up to approximately $0.835$ (ROC-AUC ≈ $0.938$) without explicit temporal trajectory modeling. The result is a two-stage approach to insider threat monitoring on the CERT r4.2 dataset, in which precise window-level alerts feed campaign discovery through aggregation, with strong reported performance across multiple attack scenarios.

Abstract

User and Entity Behavior Analytics (UEBA) systems commonly detect insider threats by scoring fixed time windows of user activity for anomalous behavior. While this window-level paradigm has proven effective for identifying sharp behavioral deviations, it remains unclear how much information about longer-running attack campaigns is already present within individual windows, and how such information can be leveraged for campaign discovery. In this work, we study unsupervised window-level insider threat detection on the CERT r4.2 dataset and show that explicitly separating activity presence from activity magnitude yields substantial performance gains. We introduce a dual-channel convolutional autoencoder that reconstructs both a binary activity mask and corresponding activity values, allowing the model to focus representational capacity on sparse behavioral structure rather than dense inactive baselines. Across multiday attack campaigns lasting between one and seven days, the proposed approach achieves a window-level precision-recall AUC of 0.71, substantially exceeding standard unsupervised autoencoder baselines and enabling high-precision operating points with zero false alarms.
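
The mask/value separation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the use of binary cross-entropy for the mask channel, and the restriction of the squared error to active cells are all assumptions made for illustration, and the autoencoder itself is omitted (its outputs are passed in as `mask_hat` and `values_hat`).

```python
import numpy as np

def split_mask_value(window):
    """Split a non-negative activity window into (mask, values).

    Hypothetical helper: `window` is a 2-D array of activity counts,
    e.g. rows = feature types, columns = time bins.
    """
    window = np.asarray(window, dtype=float)
    mask = (window > 0).astype(float)  # 1 where any activity occurred
    values = window * mask             # magnitudes, zero where inactive
    return mask, values

def dual_channel_error(mask, values, mask_hat, values_hat, eps=1e-7):
    """Per-window reconstruction errors for the two channels.

    Assumed losses (not confirmed by the source): binary cross-entropy
    on the mask, and squared error on values averaged over active cells.
    """
    p = np.clip(mask_hat, eps, 1 - eps)
    bce = -(mask * np.log(p) + (1 - mask) * np.log(1 - p)).mean()
    active = mask.sum()
    mse = ((values - values_hat) ** 2 * mask).sum() / max(active, 1.0)
    return float(bce), float(mse)

# A sparse toy window: two active cells out of eight.
mask, values = split_mask_value([[0, 3, 0, 0], [0, 0, 0, 1]])
# A reconstruction that is maximally unsure about presence (0.5 everywhere).
mask_err, value_err = dual_channel_error(mask, values, np.full_like(mask, 0.5), values)
```

Keeping the two errors separate lets the mask error serve as a window-level anomaly score on its own, which matches the figure captions' description of campaign scores built from mask reconstruction anomalies.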

Paper Structure (30 sections, 11 equations, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Dual-channel masked autoencoder for window-level UEBA. Each window is decomposed into a binary activity mask and a value matrix. The encoder maps both channels to a shared latent representation, and the decoder reconstructs both channels. Separate losses are applied to the mask and value outputs; an optional temporal consistency term penalizes latent changes between consecutive normal windows.
  • Figure 2: Window-level precision–recall curves for the proposed mask-based model using 24-hour aggregation, broken down by attack scenario on CERT r4.2. Scenario 1 involves logon and device misuse, while Scenario 3 involves logon and removable media activity. The model achieves strong performance across both scenarios, demonstrating robustness to different attack mechanisms.
  • Figure 3: Overall window-level precision–recall curves for different aggregation intervals on CERT r4.2. The 24-hour window achieves the highest PR-AUC, indicating an optimal balance between temporal context and signal dilution compared to shorter (12-hour) and longer (48-hour) windows.
  • Figure 4: Campaign-level precision–recall curve for Scenario 1 (data exfiltration via email) on CERT r4.2 using top-$k$ aggregation with $k=2$ over six-day sequences. Campaign scores are computed by aggregating window-level mask reconstruction anomalies within each sequence. The method achieves strong campaign detection performance (PR-AUC = 0.816), demonstrating that extended insider attacks in Scenario 1 manifest as a small number of highly anomalous windows rather than sustained deviation.
  • Figure 5: Campaign-level precision–recall curve for Scenario 3 (removable media misuse) on CERT r4.2 using top-$k$ aggregation with $k=2$ over six-day sequences. Despite the small number of attack campaigns, top-$k$ aggregation achieves high detection performance (PR-AUC = 0.794). The sharp transitions in precision reflect the limited number of positive examples and the extreme temporal sparsity of Scenario 3 attacks, where malicious behavior is concentrated in one or two highly anomalous windows.
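
The top-$k$ aggregation mentioned in the captions for Figures 4 and 5 can be sketched as follows. The captions do not say how the $k$ selected scores are combined, so taking their mean is an assumption, as are the function name and the toy scores; the example also assumes one score per 24-hour window, so that a six-day sequence holds six scores.

```python
import numpy as np

def topk_campaign_score(window_scores, k=2):
    """Campaign score for one sequence: mean of its k largest window scores.

    Hypothetical helper; averaging the top-k scores is an assumed
    combination rule, not one confirmed by the source.
    """
    scores = np.asarray(window_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("sequence must contain at least one window score")
    k = min(k, scores.size)
    # np.partition places the k largest values in the last k positions.
    top = np.partition(scores, -k)[-k:]
    return float(top.mean())

# Six window-level anomaly scores with two sharp spikes (made-up numbers).
sequence = [0.1, 0.2, 0.1, 0.9, 0.15, 0.8]
campaign_score = topk_campaign_score(sequence, k=2)  # approximately 0.85
baseline_mean = float(np.mean(sequence))             # approximately 0.375
```

Compared with the plain mean, the top-$k$ score is not diluted by the quiet windows in the sequence, which is consistent with the captions' observation that attacks appear as one or two highly anomalous windows rather than sustained deviation.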