SPADE: Enhancing Adaptive Cyber Deception Strategies with Generative AI and Structured Prompt Engineering

Shihab Ahmed, A B M Mohaimenur Rahman, Md Morshed Alam, Md Sajidul Islam Sajid

TL;DR

This work introduces SPADE, a structured prompt engineering framework designed to enable Generative AI to autonomously generate adaptive cyber deception ploys in response to evolving malware. By formalizing six prompt components and a three-stage process, SPADE aligns GenAI outputs with threat context, operational constraints, and deployment needs, addressing issues of genericity, ambiguity, and scalability. Across multiple GenAI models, including ChatGPT-4o, Gemini, and Llama3.2, the study demonstrates that structured PE markedly improves relevance, actionability, and realism, with ChatGPT-4o delivering the best overall engagement and accuracy while requiring minimal refinement. The findings highlight GenAI's potential to automate scalable, real-time deception strategies and underscore the critical role of disciplined prompt design for practical cybersecurity applications.

Abstract

The rapid evolution of modern malware presents significant challenges to the development of effective defense mechanisms. Traditional cyber deception techniques often rely on static or manually configured parameters, limiting their adaptability to dynamic and sophisticated threats. This study leverages Generative AI (GenAI) models to automate the creation of adaptive cyber deception ploys, focusing on structured prompt engineering (PE) to enhance relevance, actionability, and deployability. We introduce a systematic framework (SPADE) to address the inherent challenges that large language models (LLMs) pose for adaptive deception, including generalized outputs, ambiguity, under-utilization of contextual information, and scalability constraints. Evaluations across diverse malware scenarios using metrics such as Recall, Exact Match (EM), BLEU Score, and expert quality assessments identified ChatGPT-4o as the top performer; it also achieved high engagement (93%) and accuracy (96%) with minimal refinement. Gemini and ChatGPT-4o Mini demonstrated competitive performance, while Llama3.2 showed promise but required further optimization. These findings highlight the transformative potential of GenAI in automating scalable, adaptive deception strategies and underscore the critical role of structured PE in advancing real-world cybersecurity applications.

Paper Structure (14 sections, 1 figure, 4 tables)