A Sysmon Incremental Learning System for Ransomware Analysis and Detection

Jamil Ispahany; MD Rafiqul Islam; M. Arif Khan; MD Zahidul Islam

TL;DR

Ransomware detection faces a training gap: as new strains emerge, models that must be retrained from scratch leave systems exposed until the update completes. The authors propose SILRAD, an online incremental learning system that combines Sysmon logs, fastText embeddings, PCC-based feature selection, Adaptive Random Forest, and ADWIN concept-drift detection to adapt continuously without retraining from scratch. On drift-enabled data, SILRAD achieves 98.89% accuracy and an MCC of 94.11%, while using less memory and offering faster real-time inference than competing incremental methods. The work demonstrates a practical, drift-aware framework for real-time ransomware analytics that reduces data exposure during model updates and supports deployment in production-like environments.

Abstract

In the face of increasing cyber threats, particularly ransomware attacks, there is a pressing need for advanced detection and analysis systems that adapt to evolving malware behaviours. Throughout the literature, the use of machine learning (ML) to counter ransomware attacks has increased in popularity. Unfortunately, most of these proposals rely on non-incremental learning approaches that require the underlying models to be retrained from scratch to detect new ransomware, wasting time and resources. This approach is problematic because newly emerging ransomware strains may go undetected until the model is updated, leaving sensitive data vulnerable to attack during retraining. Furthermore, most of these approaches are not designed to detect ransomware in real-time data streams, limiting their effectiveness in complex network environments. To address this challenge, we present the Sysmon Incremental Learning System for Ransomware Analysis and Detection (SILRAD), which enables continuous updates to the underlying model and effectively closes the training gap. By leveraging the capabilities of Sysmon for detailed monitoring of system activities, our approach integrates online incremental learning techniques to enhance the adaptability and efficiency of ransomware detection. The most valuable features for detection were selected using the Pearson Correlation Coefficient (PCC), and concept drift detection was implemented through the ADWIN algorithm, ensuring that the model remains responsive to changes in ransomware behaviour. We compared SILRAD with other popular techniques, such as Hoeffding Trees (HT) and Leveraging Bagging Classifier (LB), and observed a detection accuracy of 98.89% and a Matthews Correlation Coefficient (MCC) of 94.11%, demonstrating the effectiveness of our technique.
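
To make two quantities named in the abstract concrete, the following minimal sketch ranks features by the absolute Pearson Correlation Coefficient against a binary label and computes the Matthews Correlation Coefficient from confusion-matrix counts. The feature names and values are hypothetical toy data, not taken from the paper's dataset, and the paper's own selection procedure may differ in detail (for example, in how many features are kept).

```python
import math

def pearson(xs, ys):
    """Pearson Correlation Coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as uninformative.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, labels):
    """Return feature names sorted by |PCC| with the label, strongest first."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical toy data: 1 = ransomware, 0 = benign.
labels = [1, 1, 0, 0, 1, 0]
columns = {
    "file_rename_count": [9, 8, 1, 0, 7, 2],   # hypothetical feature
    "process_id_parity": [0, 1, 0, 1, 1, 0],   # hypothetical feature
    "constant_flag": [1, 1, 1, 1, 1, 1],       # hypothetical feature
}
ranking = rank_features(columns, labels)
print(ranking)
```

In this toy example the rename-count feature ranks first because it tracks the label closely, while the constant column ranks last. MCC is reported alongside accuracy because it stays informative when benign and malicious samples are imbalanced.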

Paper Structure (24 sections, 16 equations, 10 figures, 4 tables)

Introduction
Related work
Ransomware behaviour and detection methods
Ransomware detection using online incremental learning
Leveraging Sysmon logs for ransomware detection
Proposed approach
Dynamic analysis using Sysmon
Feature representation using fastText
Feature selection technique
Online incremental learning
Concept drift detection
Experiment
Experiment setup
SILRAD environment
Network considerations
...and 9 more sections

Figures (10)

Figure 1: The Proposed Sysmon Incremental Learning System for Ransomware Analysis and Detection (SILRAD). The blue arrows indicate the data stream used to train the model, and the red arrows indicate system activity to be classified
Figure 2: fastText conversion to vectors using n-grams. The above example shows the process to convert the word "explain" into vectors where $v_n$ represents the vector representation of the corresponding $n$-gram
Figure 3: The most significant Sysmon features calculated by the Pearson Correlation Coefficient (PCC)
Figure 4: The online learning model used by SILRAD whereby the model is trained and predictions made per instance of data arriving
Figure 5: The experiment setup used to detonate ransomware and harvest features
...and 5 more figures
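
The n-gram decomposition described in the Figure 2 caption can be sketched as follows. This is an illustrative approximation of how fastText splits a word into character n-grams after wrapping it in `<` and `>` boundary markers; the function name `char_ngrams` and its defaults are assumptions made for this example, and the n-gram lengths used in the paper's figure may differ.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers.

    The 3-to-6 default range is an assumption for illustration.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams

# Trigrams only, for the word used in Figure 2.
trigrams = char_ngrams("explain", min_n=3, max_n=3)
print(trigrams)
```

Each n-gram would then be mapped to its own vector $v_n$, and the word's representation is built by combining those vectors, which is what lets the embedding handle strings it has not seen before.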
