From One Attack Domain to Another: Contrastive Transfer Learning with Siamese Networks for APT Detection
Sidahmed Benabderrahmane, Talal Rahwan
TL;DR
The paper addresses cross‑domain APT detection under data scarcity and heterogeneous feature spaces by proposing a hybrid transfer framework that combines an attention‑based autoencoder (AAE), SHAP/entropy‑driven feature selection, contrastive learning, and Siamese network alignment, augmented with synthetic data from cGANs and VAEs. This integrated approach aims to improve anomaly ranking (via nDCG) and binary detection (via AUC) across multiple OS platforms, while maintaining explainability and scalability for SOC deployments. Empirical results on DARPA TC traces demonstrate statistically significant gains over classical and deep baselines across BSD, Windows, Linux, and Android, with ablations confirming the critical roles of feature selection and cross‑domain alignment. The work further enhances interpretability by mapping detected behaviors to MITRE ATT&CK techniques and providing LLM‑assisted natural language explanations, facilitating analyst decision‑making in real‑world settings.
Abstract
Advanced Persistent Threats (APTs) pose a major cybersecurity challenge due to their stealth, persistence, and adaptability. Traditional machine learning detectors struggle with class imbalance, high-dimensional features, and scarce real-world traces. They also often lack transferability: they perform well in the training domain but degrade in novel attack scenarios. We propose a hybrid transfer framework that integrates Transfer Learning, Explainable AI (XAI), contrastive learning, and Siamese networks to improve cross-domain generalization. An attention-based autoencoder supports knowledge transfer across domains, while SHapley Additive exPlanations (SHAP) select stable, informative features to reduce dimensionality and computational cost. A Siamese encoder trained with a contrastive objective aligns source and target representations, increasing anomaly separability and mitigating feature drift. We evaluate on real-world traces from the DARPA Transparent Computing (TC) program and augment them with synthetic attack scenarios to test robustness. Across source-to-target transfers, the approach achieves higher detection scores than classical and deep baselines, demonstrating a scalable, explainable, and transferable solution for APT detection.
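To make the contrastive alignment idea concrete, the following is a minimal sketch of a margin-based pairwise contrastive loss of the kind commonly used with Siamese encoders. The abstract does not specify the exact objective, so the loss form, the function name `contrastive_loss`, the `margin` parameter, and the toy embeddings are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def contrastive_loss(z_a, z_b, same, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form, not from the paper).

    z_a, z_b : (n, d) arrays of embeddings, e.g. source- and target-domain
               outputs of a shared (Siamese) encoder.
    same     : (n,) array, 1 if the pair shares a label (both benign or both
               attack), 0 otherwise.
    """
    # Euclidean distance between the two embeddings of each pair
    dist = np.linalg.norm(z_a - z_b, axis=1)
    # Similar pairs are pulled together: penalize any distance
    pull = same * dist ** 2
    # Dissimilar pairs are pushed apart until they are at least `margin` away
    push = (1 - same) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))

# Toy example with hypothetical 2-D embeddings
z_src = np.array([[0.0, 0.0], [0.0, 0.0]])
z_tgt = np.array([[0.0, 0.0], [0.6, 0.8]])  # pair distances: 0.0 and 1.0
same = np.array([1, 0])                     # first pair similar, second dissimilar
loss = contrastive_loss(z_src, z_tgt, same, margin=2.0)
```

Under this assumed objective, minimizing the loss draws same-class source and target embeddings together while keeping benign and attack embeddings at least `margin` apart, which is one way to obtain the separability the abstract describes.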
