From One Attack Domain to Another: Contrastive Transfer Learning with Siamese Networks for APT Detection

Sidahmed Benabderrahmane, Talal Rahwan

TL;DR

The paper addresses cross‑domain APT detection under data scarcity and heterogeneous feature spaces by proposing a hybrid transfer framework that combines an attention‑based autoencoder (AAE), SHAP/entropy‑driven feature selection, contrastive learning, and Siamese network alignment, augmented with synthetic data from cGANs and VAEs. This integrated approach aims to improve anomaly ranking (via nDCG) and binary detection (via AUC) across multiple OS platforms, while maintaining explainability and scalability for SOC deployments. Empirical results on DARPA TC traces demonstrate statistically significant gains over classical and deep baselines across BSD, Windows, Linux, and Android, with ablations confirming the critical roles of feature selection and cross‑domain alignment. The work further enhances interpretability by mapping detected behaviors to MITRE ATT&CK techniques and providing LLM‑assisted natural language explanations, facilitating analyst decision‑making in real‑world settings.
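As a rough illustration of the ranking metric mentioned above, the sketch below computes nDCG for a list of anomaly scores against binary ground-truth labels. It assumes binary relevance (1 = attack, 0 = benign) and the standard log2 discount; the function name `ndcg` and the toy inputs are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def ndcg(scores, labels, k=None):
    """nDCG of the ranking induced by anomaly scores (higher = more anomalous).

    labels are assumed binary: 1 = attack, 0 = benign (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    k = len(scores) if k is None else min(k, len(scores))

    # Rank items from most to least anomalous.
    order = np.argsort(-scores, kind="stable")
    # Position i (0-based) is discounted by 1 / log2(i + 2).
    discounts = 1.0 / np.log2(np.arange(2, k + 2))

    dcg = float(np.sum(labels[order][:k] * discounts))
    # Ideal ranking places every attack ahead of every benign item.
    ideal = np.sort(labels)[::-1][:k]
    idcg = float(np.sum(ideal * discounts))
    return dcg / idcg if idcg > 0 else 0.0

if __name__ == "__main__":
    # Toy example: two attacks among five processes.
    print(ndcg([0.9, 0.2, 0.7, 0.1, 0.3], [1, 0, 0, 0, 1]))
```

A score of 1.0 means every attack is ranked above every benign item; lower values indicate attacks buried deeper in the ranking, which is what an analyst triaging from the top of the list would feel.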

Abstract

Advanced Persistent Threats (APTs) pose a major cybersecurity challenge due to their stealth, persistence, and adaptability. Traditional machine learning detectors struggle with class imbalance, high-dimensional features, and scarce real-world traces. They often lack transferability, performing well in the training domain but degrading in novel attack scenarios. We propose a hybrid transfer framework that integrates Transfer Learning, Explainable AI (XAI), contrastive learning, and Siamese networks to improve cross-domain generalization. An attention-based autoencoder supports knowledge transfer across domains, while Shapley Additive exPlanations (SHAP) select stable, informative features to reduce dimensionality and computational cost. A Siamese encoder trained with a contrastive objective aligns source and target representations, increasing anomaly separability and mitigating feature drift. We evaluate on real-world traces from the DARPA Transparent Computing (TC) program and augment with synthetic attack scenarios to test robustness. Across source-to-target transfers, the approach delivers improved detection scores compared with classical and deep baselines, demonstrating a scalable, explainable, and transferable solution for APT detection.
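To make the Siamese alignment idea more concrete, here is a minimal sketch of a shared-weight encoder with a margin-based contrastive loss. The abstract does not give the exact objective, so this uses the classic pairwise form (squared distance for similar pairs, squared hinge on the margin for dissimilar pairs) as an assumption. The single-layer `encode` function, the `margin` value, and the random toy data are all hypothetical stand-ins for the paper's actual architecture.

```python
import numpy as np

def encode(x, W):
    """Toy encoder branch. Both inputs pass through the same W (shared weights)."""
    return np.tanh(x @ W)

def contrastive_loss(z1, z2, same, margin=1.0):
    """Classic margin-based contrastive loss over embedding pairs.

    same[i] = 1 if pair i should be close (e.g. same behaviour across domains),
    same[i] = 0 if pair i should be at least `margin` apart.
    """
    same = np.asarray(same, dtype=float)
    d = np.linalg.norm(z1 - z2, axis=1)                  # Euclidean distance per pair
    pull = same * d ** 2                                 # similar pairs: shrink distance
    push = (1.0 - same) * np.maximum(0.0, margin - d) ** 2  # dissimilar: push past margin
    return float(np.mean(pull + push))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(8, 4))          # shared weights for both branches
    x_src = rng.normal(size=(5, 8))      # hypothetical source-domain feature vectors
    x_tgt = rng.normal(size=(5, 8))      # hypothetical target-domain feature vectors
    same = np.array([1, 0, 1, 1, 0])     # hypothetical pair labels
    print(contrastive_loss(encode(x_src, W), encode(x_tgt, W), same))
```

Because both branches use the same `W`, minimizing this loss moves source and target embeddings of similar behaviour toward each other in one common space, which is the alignment effect the abstract describes.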

Paper Structure

This paper contains 55 sections, 26 equations, 14 figures, 8 tables, 1 algorithm.

Figures (14)

  • Figure 1: Pipeline for Transfer Learning in APT Detection: The framework begins with provenance graph databases and explainable AI (XAI) feature selection to extract the most informative features. An autoencoder is pretrained on source attack data to learn latent representations, which are then transferred to the target domain. Feature space refinement is achieved through contrastive learning, ensuring better separation between positive and negative pairs. A Siamese network further aligns source and target embeddings by minimizing feature space discrepancies. The final transformed embeddings are used for anomaly detection, producing anomaly scores that enhance cybersecurity threat identification.
  • Figure 2: Principle of contrastive learning: For an anchor process $r$ (yellow), source and target positives (orange x and blue $\blacksquare$) are pulled closer while negatives (green ×) are pushed away. Before learning (left), embeddings are dispersed; after optimizing a contrastive loss (right), positives are pulled within a similarity margin (dashed circle) and negatives are driven outside the margin, reducing intra-class distance, increasing inter-class separation, and aligning source–target representations for improved anomaly ranking.
  • Figure 3: Siamese Network Architecture for Transfer Learning in APT Detection. Two input feature vectors are processed through identical neural network branches with shared weights. The resulting embeddings are compared using a similarity function to determine whether the inputs exhibit similar behavior, enabling cross-domain generalization for anomaly or threat detection.
  • Figure 4: Organization of the DARPA TC datasets. Each OS undergoes two attack scenarios, each of which contains five data aspects. With four operating systems (BSD, Windows, Linux, Android), two attack scenarios, and five aspects (PE, PX, PP, PN, PA), there are forty individual datasets in total. Each forensic configuration (OS$\times$attack scenario$\times$data aspect) represents a single dataset [DBLP:journals/fgcs/BenabderrahmaneHVCR24].
  • Figure 5: Density plots illustrating cumulative feature importance across real (Scenarios 1 and 2) and synthetic attack scenarios (generated via cGANs and VAEs). The vertical dashed red line indicates the cumulative contribution threshold used to retain the most informative features for optimized feature selection and transfer learning. From top to bottom, the datasets are: Android$\times$PE$\times$E1, Linux$\times$PE$\times$E1, BSD$\times$PE$\times$E1, and Windows$\times$PE$\times$E1.
  • ...and 9 more figures
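
The cumulative contribution threshold described in the Figure 5 caption can be sketched as follows: rank features by importance, accumulate their normalized contributions, and keep the smallest prefix that reaches the threshold. This is a minimal sketch under stated assumptions, not the authors' implementation. The importance vector stands in for something like mean absolute SHAP values per feature, and the function name and the 0.9 default threshold are hypothetical (the caption does not state the value used).

```python
import numpy as np

def select_by_cumulative_importance(importance, threshold=0.9):
    """Return indices of the top features whose normalized importances
    cumulatively reach `threshold` (a hypothetical default of 0.9)."""
    imp = np.abs(np.asarray(importance, dtype=float))
    total = imp.sum()
    if total == 0:
        return np.array([], dtype=int)   # no informative features at all

    # Sort features from most to least important.
    order = np.argsort(-imp, kind="stable")
    cum = np.cumsum(imp[order]) / total  # cumulative share of total importance

    # First position where the cumulative share reaches the threshold.
    n_keep = int(np.searchsorted(cum, threshold, side="left")) + 1
    return order[:min(n_keep, len(order))]

if __name__ == "__main__":
    # Hypothetical per-feature importances (e.g. mean |SHAP| values).
    importance = [4.0, 1.0, 3.0, 2.0]
    print(select_by_cumulative_importance(importance, threshold=0.65))
```

In this toy run the two largest features (indices 0 and 2) account for 70% of the total importance, so they are the ones retained at a 0.65 threshold; raising the threshold keeps more features at the cost of higher dimensionality.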