Predicting SSH keys in Open SSH Memory dumps

Florian Rascoussier

TL;DR

This work tackles the problem of predicting SSH keys embedded in OpenSSH memory dumps, with applications in digital forensics and attack-monitoring tools. It advances memory forensics by introducing a memory-graph representation (memgraph) and a Rust-based mem2graph pipeline that converts RAW heap dumps into graph-structured embeddings suitable for ML/DL. The study systematically compares classic ML classifiers and Graph Convolutional Networks across multiple embedding schemes, and finds that Node2Vec-based graph embeddings often yield the strongest performance despite severe class imbalance, while entropy-based and chunk-size-based filtering substantially reduce the data volume. The results demonstrate the feasibility of memory-graph-driven key prediction and come with a reproducible, open-science workflow (datasets, code, and documentation). The work also outlines its limitations and avenues for future work on real-time forensic tooling and broader OpenSSH variants.
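The entropy-based filtering mentioned above can be illustrated with a minimal sketch. It rests on the general observation that cryptographic key material tends to look random, so low-entropy blocks (for example, zero-filled regions) can be discarded before any model is trained. The function names, the 32-byte block size, and the 4.0-bit threshold are assumptions chosen for illustration; they are not taken from mem2graph or from the thesis.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # H = sum over byte values of p * log2(1 / p), with p = count / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def filter_blocks(data: bytes, block_size: int = 32, min_entropy: float = 4.0):
    """Yield (offset, block) for each aligned block whose entropy is at least
    min_entropy. Both default values are hypothetical, for illustration only."""
    for offset in range(0, len(data) - block_size + 1, block_size):
        block = data[offset:offset + block_size]
        if shannon_entropy(block) >= min_entropy:
            yield offset, block


if __name__ == "__main__":
    # Synthetic "dump": zero padding around one block of 32 distinct bytes.
    dump = bytes(32) + bytes(range(32)) + bytes(32)
    for offset, block in filter_blocks(dump):
        print(f"candidate block at offset {offset}: {block.hex()}")
```

Note that a 32-byte block can reach at most 5 bits of entropy per byte (32 distinct values), so any threshold has to be chosen relative to the block size. A real pipeline would also have to respect heap chunk boundaries rather than fixed offsets.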

Abstract

As the digital landscape evolves, cybersecurity has become an indispensable focus of IT systems. Its ever-escalating challenges have amplified the importance of digital forensics, particularly the analysis of heap dumps from main memory. In this context, the Secure Shell protocol (SSH), designed for encrypted communications, serves as both a safeguard and a potential veil for malicious activities. This research project focuses on predicting SSH keys in OpenSSH memory dumps, aiming to enhance protective measures against illicit access and to enable the development of advanced security frameworks or tools such as honeypots. This Master's thesis is situated within the broader SmartVMI project and builds upon existing research on key prediction in OpenSSH heap dumps. Using machine learning (ML) and deep learning models, the study aims to refine features for embedding techniques and to explore new methods for effective key detection based on recent advances in knowledge graphs and ML. The objective is to accurately predict the presence and location of SSH keys within memory dumps. This work builds upon, and aims to enhance, the foundations laid by SSHkex and SmartKex, enriching both the methodology and the results of the original research while exploring the untapped potential of newly proposed approaches. The thesis covers memory graph modeling from raw binary heap dump files. Each memory graph can support a range of embeddings that can be used directly for model training with classic ML models and graph neural networks. The thesis offers an in-depth discussion of the current state of the art in key prediction for OpenSSH memory dumps, the research questions, the experimental setups, program development, and the results, and it discusses potential future directions.

Paper Structure (202 sections, 2 equations, 22 figures, 17 tables, 12 algorithms)


Figures (22)

  • Figure 1: Graphical representation of an rdf:Bag container.
  • Figure 2: Illustration of the Dataset Directory Structure
  • Figure 3: Binary RAW heap dump file loaded using vim and xxd, from /Training/Training/scp/V_7_8_P1/16/1010-1644391327-heap.raw, with highlight on rows with 12 hexadecimal digits followed by 4 zeros.
  • Figure 4: Diagram of an allocated chunk in GLIBC 2.28 [MallocGLIBC2001].
  • Figure 5: Diagram of a free chunk in GLIBC 2.28 [MallocGLIBC2001].
  • ...and 17 more figures