CyberCScope: Mining Skewed Tensor Streams and Online Anomaly Detection in Cybersecurity Systems
Kota Nakamura, Koki Kawabata, Shungo Tanaka, Yasuko Matsubara, Yasushi Sakurai
TL;DR
The paper tackles real-time anomaly detection in skewed, high-order tensor streams from cybersecurity systems, which combine categorical attributes with skewed continuous attributes. It introduces CyberCScope, a streaming framework built on the OP-SiFi decomposition, which extracts major trends across skewed infinite and finite dimensional spaces and learns multiple time-evolving regimes. A regime-based compact description (C) and model compression based on minimum description length (MDL) support online anomaly scoring that adapts to regime switches, so multiple intrusion types can be detected with interpretable signatures. Experiments on the large-scale CI'17 and CCI'18 datasets show higher accuracy (ROC-AUC and PR-AUC) than state-of-the-art baselines, along with linear scalability suited to real-time, proactive cybersecurity monitoring.
Abstract
Cybersecurity systems continuously produce a huge number of time-stamped events in the form of high-order tensors, such as {count; time, port, flow duration, packet size, ...}. How can we detect anomalies/intrusions in real time? How can we identify multiple types of intrusions and capture their characteristic behaviors? The tensor data consists of categorical and continuous attributes, and the distributions of the continuous attributes typically exhibit skew. These data properties require handling skewed infinite and finite dimensional spaces simultaneously. In this paper, we propose a novel streaming method, namely CyberCScope. The method effectively decomposes incoming tensors into major trends while explicitly distinguishing between categorical and skewed continuous attributes. To our knowledge, it is the first to compute a hybrid skewed infinite and finite dimensional decomposition. Based on this decomposition, it finds distinct time-evolving patterns in a streaming fashion, enabling the detection of multiple types of anomalies. Extensive experiments on large-scale real datasets demonstrate that CyberCScope detects various intrusions with higher accuracy than state-of-the-art baselines while providing meaningful summaries of the intrusions that occur in practice.
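To make the input format concrete, the following minimal sketch shows one way time-stamped events could be aggregated into a sparse count tensor of the form {count; time, port, flow duration, packet size}. It illustrates the data layout only and is not the CyberCScope decomposition. The event records, the `to_sparse_tensor` helper, and the one-second window are assumptions made for illustration. The gap between the mean and the median of the flow durations is used here as a rough indicator of the skew described in the abstract.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical event records (illustrative values, not from the paper):
# (timestamp_sec, dst_port, flow_duration_ms, packet_size_bytes)
events = [
    (0.4, 443, 12.0, 60),
    (0.9, 443, 12.0, 60),
    (1.2, 80, 15.0, 1500),
    (1.7, 22, 9000.0, 60),
    (2.1, 443, 12.0, 60),
    (2.8, 80, 30.0, 1500),
]


def to_sparse_tensor(events, window_sec=1.0):
    """Count events per (time tick, port, duration, size) index.

    Only non-zero entries are stored, so the continuous attributes
    (duration, size) do not need a fixed, finite set of indices.
    """
    counts = Counter()
    for ts, port, duration, size in events:
        tick = int(ts // window_sec)  # discretize time into windows
        counts[(tick, port, duration, size)] += 1
    return counts


tensor = to_sparse_tensor(events)

# A mean far above the median suggests a right-skewed attribute.
durations = [e[2] for e in events]
skew_gap = mean(durations) - median(durations)

for index, count in sorted(tensor.items()):
    print(index, count)
print("mean - median of flow duration:", skew_gap)
```

In this toy stream a single long-lived flow on port 22 pulls the mean duration far above the median, which is the kind of skew in continuous attributes that the method is designed to handle.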
