Web Scale Graph Mining for Cyber Threat Intelligence

Scott Freitas, Amir Gharib

TL;DR

TITAN is an industry-scale graph mining framework that generates cyber threat intelligence at unprecedented speed and scale.

Abstract

Defending against today's increasingly sophisticated and large-scale cyberattacks demands accurate, real-time threat intelligence. Traditional approaches struggle to scale, integrate diverse telemetry, and adapt to a constantly evolving security landscape. We introduce Threat Intelligence Tracking via Adaptive Networks (TITAN), an industry-scale graph mining framework that generates cyber threat intelligence at unprecedented speed and scale. TITAN introduces a suite of innovations specifically designed to address the complexities of the modern security landscape, including: (1) a dynamic threat intelligence graph that maps the intricate relationships between millions of entities, incidents, and organizations; (2) real-time update mechanisms that automatically decay and prune outdated intel; (3) integration of security domain knowledge to bootstrap initial reputation scores; and (4) reputation propagation algorithms that uncover hidden threat actor infrastructure. Integrated into Microsoft Unified Security Operations Platform (USOP), which is deployed across hundreds of thousands of organizations worldwide, TITAN's threat intelligence powers key detection and disruption capabilities. With an impressive average macro-F1 score of 0.89 and a precision-recall AUC of 0.94, TITAN identifies millions of high-risk entities each week, enabling a 6x increase in non-file threat intelligence. Since its deployment, TITAN has increased the product's incident disruption rate by a remarkable 21%, while reducing the time to disrupt by a factor of 1.9x, and maintaining 99% precision, as confirmed by customer feedback and thorough manual evaluation by security experts--ultimately saving customers from costly security breaches.

Paper Structure

This paper contains 22 sections, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Overview of the TITAN architecture: an industry-scale graph mining framework that generates real-time TI by propagating reputation scores across millions of interconnected entities, incidents, and organizations. Built on a time-evolving 5-partite graph, the system operates through four key components: (1) dynamic graph construction and updates; (2) integration of known TI and security domain knowledge to bootstrap reputation scores for unknown entities; (3) reputation propagation to iteratively update risk scores; and (4) model calibration to probabilistically align scores for use by security analysts.
  • Figure 2: Graph size and performance over a 7-day period. The left y-axis plots the number of nodes, nodes in the largest connected component (LCC), and edges. The right y-axis captures the average macro-F1 score across all entities (global).
  • Figure 3: Reputation propagation distributes scores to neighboring entities, reducing unknowns. Calibration smooths the distribution and reduces the frequency of ambiguous mid-range scores.
  • Figure 4: Convergence of label propagation measured by the average Frobenius norm residual across 466 runs.
  • Figure 5: Precision-recall curves for each entity type in Region 1 with at least 100 ground-truth examples. The global micro PR curve represents the average detection performance across all entities.
  • ...and 1 more figure
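
The captions for Figures 3 and 4 refer to reputation propagation and to convergence measured by a Frobenius norm residual. The following is a minimal sketch of that idea, not TITAN's actual algorithm, which this summary does not specify. It assumes a standard label propagation update, F <- alpha * S * F + (1 - alpha) * Y, with a symmetrically normalized adjacency matrix S. The function name `propagate`, the parameters `alpha` and `tol`, and the four-node toy graph are all hypothetical.

```python
import math


def propagate(adj, seeds, alpha=0.8, tol=1e-6, max_iter=200):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y and record the
    Frobenius-norm residual ||F_new - F||_F after every step.

    adj:   symmetric, non-negative weighted adjacency matrix (list of lists)
    seeds: n x k matrix Y of initial scores (all-zero row = unknown entity)
    """
    n, k = len(adj), len(seeds[0])
    deg = [sum(row) for row in adj]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}; missing edges stay 0.
    s = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    f = [row[:] for row in seeds]
    residuals = []
    for _ in range(max_iter):
        f_new = [[alpha * sum(s[i][m] * f[m][c] for m in range(n))
                  + (1 - alpha) * seeds[i][c]
                  for c in range(k)] for i in range(n)]
        # Frobenius norm of the change between consecutive iterates.
        res = math.sqrt(sum((f_new[i][c] - f[i][c]) ** 2
                            for i in range(n) for c in range(k)))
        residuals.append(res)
        f = f_new
        if res < tol:
            break
    return f, residuals


# Hypothetical toy graph: a path 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Score columns are [benign, malicious]; nodes 1 and 2 start unknown.
seeds = [
    [0.0, 1.0],  # node 0: known malicious
    [0.0, 0.0],
    [0.0, 0.0],
    [1.0, 0.0],  # node 3: known benign
]
scores, residuals = propagate(adj, seeds)
```

In this toy run, the unknown node next to the malicious seed ends up with a higher malicious score than benign score, and the residual shrinks at every iteration until it drops below `tol`. The geometric decay follows from `alpha < 1` and the normalized matrix having spectral norm at most 1, which is the kind of convergence behavior a residual plot would show.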