LLMalMorph: On The Feasibility of Generating Variant Malware using Large-Language-Models

Md Ajwad Akil, Adrian Shuai Li, Imtiaz Karim, Arun Iyengar, Ashish Kundu, Vinny Parla, Elisa Bertino

TL;DR

The paper investigates whether off-the-shelf Large Language Models can be steered, without fine-tuning, to generate functional malware variants from source code. It introduces LLMalMorph, a two-module framework that uses function-level AST extraction and six prompt-guided code transformations to produce diverse, compilable variants while preserving core semantics; a human in the loop assists with debugging. The authors generate 618 variants from 10 Windows malware samples and report meaningful antivirus evasion (up to 15% reduction in detection rate on VirusTotal and 8–13% on Hybrid Analysis) and high attack success rates (up to 91% against an ML-based detector), with semantics preserved in a majority of evasive variants. The work draws practical lessons on prompt design, transformation strategy, and the trade-off between manual effort and evasion efficacy, and it discusses responsible disclosure and ethics given the dual-use nature of the technology. Overall, LLMalMorph demonstrates both the feasibility and the risks of source-code-level, LLM-guided malware variant generation, and it motivates defensive research and governance around AI-assisted cyber threats.

Abstract

Large Language Models (LLMs) have transformed software development and automated code generation. Motivated by these advancements, this paper explores the feasibility of LLMs in modifying malware source code to generate variants. We introduce LLMalMorph, a semi-automated framework that leverages semantical and syntactical code comprehension by LLMs to generate new malware variants. LLMalMorph extracts function-level information from the malware source code and employs custom-engineered prompts coupled with strategically defined code transformations to guide the LLM in generating variants without resource-intensive fine-tuning. To evaluate LLMalMorph, we collected 10 diverse Windows malware samples of varying types, complexity and functionality and generated 618 variants. Our experiments demonstrate that LLMalMorph variants can effectively evade antivirus engines, achieving typical detection rate reductions of 10-15% across multiple complex samples. Furthermore, without explicitly targeting learning-based detectors, LLMalMorph attained attack success rates of up to 91% against a Machine Learning (ML) based malware detector. We also discuss the limitations of current LLM capabilities in generating malware variants from source code and assess where this emerging technology stands in the broader context of malware variant generation.

Paper Structure

This paper contains 36 sections, 1 equation, 3 figures, 7 tables, 3 algorithms.

Figures (3)

  • Figure 1: Overall Architecture of LLMalMorph. The framework is organized into two main modules. Function Mutator extracts functions from the malware source code file and modifies them using an LLM. Variant Synthesizer updates the malware source code with the modified function and compiles the project to generate the variant.
  • Figure 2: Comparison of detection rates for different strategies across VirusTotal and Hybrid Analysis for ten malware samples.
  • Figure 3: Average Human Effort per variant across strategies.