Facade: High-Precision Insider Threat Detection Using Deep Contextual Anomaly Detection

Alex Kantchelian, Casper Neo, Ryan Stevens, Hyungwon Kim, Zhaohao Fu, Sadegh Momeni, Birkett Huber, Elie Bursztein, Yanis Pavlidis, Senaka Buthpitiya, Martin Cochran, Massimiliano Poletto

TL;DR

Facade tackles insider-threat detection by reframing anomaly scoring at the single-event level through a context-action decomposition and a contrastive, self-supervised training regime. It introduces positive sampling to enable learning from benign logs, leverages historical access-based and implicit social network features to generalize to new users/resources, and uses a clustering-based aggregator to achieve high precision with minimal audits. Empirical evaluation in a real-world, large-scale Google environment shows extremely low false positive rates (below $0.01\%$) and effective detection of simulated insider attacks, with robustness to distribution shifts and label noise. The approach provides a practical, scalable defense-in-depth mechanism and suggests avenues for proactive security enhancements and finer-grained access control.

Abstract

We present Facade (Fast and Accurate Contextual Anomaly DEtection): a high-precision deep-learning-based anomaly detection system deployed at Google (a large technology company) as the last line of defense against insider threats since 2018. Facade is an innovative unsupervised action-context system that detects suspicious actions by considering the context surrounding each action, including relevant facts about the user and other entities involved. It is built around a new multi-modal model that is trained on corporate document access, SQL query, and HTTP/RPC request logs. To overcome the scarcity of incident data, Facade harnesses a novel contrastive learning strategy that relies solely on benign data. Its use of history and implicit social network featurization efficiently handles the frequent out-of-distribution events that occur in a rapidly changing corporate environment, and sustains Facade's high-precision performance for a full year after training. Beyond the core model, Facade contributes an innovative clustering approach based on user and action embeddings to improve detection robustness and achieve high-precision, multi-scale detection. Functionally, what sets Facade apart from existing anomaly detection systems is its high precision. It detects insider attackers with an extremely low false positive rate, lower than 0.01%. For single rogue actions, such as the illegitimate access to a sensitive document, the false positive rate is as low as 0.0003%. To the best of our knowledge, Facade is the only published insider risk anomaly detection system that helps secure such a large corporate environment.
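
The contrastive strategy that relies solely on benign data can be pictured with a small sketch. It assumes that each log entry reduces to a (context, action) pair, that logged pairs are treated as natural examples, and that contrast examples are built by re-pairing a context with an action logged elsewhere. The function name `make_training_pairs`, the string identifiers, and the one-mismatch-per-pair ratio are all hypothetical choices for illustration, not details taken from the paper.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (context_id, action_id); hypothetical identifiers


def make_training_pairs(benign: List[Pair], seed: int = 0) -> List[Tuple[str, str, int]]:
    """Label each logged pair 1 and add one mismatched pair labeled 0.

    Assumption: mismatched pairs are formed by re-pairing a context with an
    action taken from the benign log, so no incident data is needed.
    """
    rng = random.Random(seed)
    actions = [action for _, action in benign]
    observed = set(benign)
    examples = []
    for context, action in benign:
        examples.append((context, action, 1))
        # Only keep candidate actions that were never logged with this context.
        candidates = [a for a in actions if (context, a) not in observed]
        if candidates:
            examples.append((context, rng.choice(candidates), 0))
    return examples


benign_log = [("alice", "doc1"), ("bob", "doc2"), ("carol", "doc3")]
for example in make_training_pairs(benign_log):
    print(example)
```

Drawing the replacement action from the list of logged actions means frequently accessed resources appear more often among the mismatched pairs, which is one simple way to sample without any labeled attacks.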

Paper Structure

This paper contains 51 sections, 1 theorem, 20 equations, 9 figures, 2 tables.

Key Result

Theorem 1

Let $\mathbb{A} = \{1, \dots, n_a\}$, $\mathbb{C} = \{1, \dots, n_c\}$, and let $P$ be a probability distribution over $\mathbb{A} \times \mathbb{C}$. Let $\ell: \mathbb{R} \rightarrow \mathbb{R}$ be a decreasing, strictly convex, differentiable function where $\ell'(t) \rightarrow 0$ at infinity. A binary classifier $f$ … where we used the following notational shorthands to denote the marginals of actions and contexts: …

Figures (9)

  • Figure 1: Facade system overview at inference time. Light gray rectangles represent fixed computational processes, cylinders are data sets, and trapezoids are (parts of) learned ML models. Data snippets are provided in dashed boxes at various points of the pipeline. Blue bookmarks reference the relevant paper sections.
  • Figure 2: Facade system at training time. Facade employs a self-supervised, contrastive training strategy: the model is optimized to differentiate between natural and mismatched pairs of action-contexts.
  • Figure 3: Facade tower architecture. Green rounded shapes denote transformations, and rectangles denote concrete values. The subarchitecture that reduces a variable-length weighted token set feature to a fixed-size intermediary representation is expanded. By convention, our final embeddings are non-negative and L2 normalized.
  • Figure 4: History-based action featurization. Timestamp and access modality are not featurized. Any action is represented by the weighted set of all previous accessors, where the weights are proportional to the frequency of accesses and sum to 1.
  • Figure 5: Validation setup for tuning model architecture and hyperparameters. To properly assess model generalization performance in the presence of temporal distribution shift and of entities unseen at training time, we separate the training and testing sets in both time and "space" dimensions.
  • ...and 4 more figures
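
The history-based featurization in the Figure 4 caption can be sketched in a few lines. The sketch assumes that "previous accessors" means the users who earlier accessed the resource targeted by the action, and that the history is available as (user, resource) records. The function name `accessor_weights` and the record layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def accessor_weights(history: Iterable[Tuple[str, str]], resource: str) -> Dict[str, float]:
    """Return the previous accessors of `resource` with frequency-proportional weights.

    Timestamps and access modality are ignored, as in the caption.
    The weights sum to 1 whenever the resource has any recorded history.
    """
    counts = Counter(user for user, res in history if res == resource)
    total = sum(counts.values())
    if total == 0:
        return {}  # resource never accessed before: empty token set
    return {user: n / total for user, n in counts.items()}


history = [("alice", "doc1"), ("bob", "doc1"), ("alice", "doc1"), ("carol", "doc2")]
print(accessor_weights(history, "doc1"))
```

Here `alice` receives weight 2/3 and `bob` 1/3 for `doc1`. Because the representation is built from who accessed a resource rather than from the resource's identity, a resource created after training can still be featurized as soon as it has some access history.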

Theorems & Definitions (2)

  • Theorem 1: positive sampling + pointwise loss $\Rightarrow$ lift
  • proof
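
For readers unfamiliar with the term in the theorem's title, the following assumes "lift" carries its usual meaning from association analysis. For a distribution $P$ over $\mathbb{A} \times \mathbb{C}$, it compares the joint probability of an action-context pair with the product of its marginals:

```latex
\mathrm{lift}(a, c) = \frac{P(a, c)}{P(a)\,P(c)},
\qquad
P(a) = \sum_{c' \in \mathbb{C}} P(a, c'),
\qquad
P(c) = \sum_{a' \in \mathbb{A}} P(a', c).
```

A lift above 1 means the action and context co-occur more often than they would if independent, and a lift below 1 means they co-occur less often. This is a reading of the title only. The precise statement and notation are those of the paper.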