ThreatModeling-LLM: Automating Threat Modeling using Large Language Models for Banking System

Tingmin Wu, Shuiqiao Yang, Shigang Liu, David Nguyen, Seung Jang, Alsharif Abuadbba

TL;DR

This work tackles automation of threat modeling in the banking domain by leveraging LLMs through a three-stage ThreatModeling-LLM framework: dataset creation using the Microsoft Threat Modeling Tool, prompt engineering with Chain-of-Thought and Optimization by Prompting, and domain-specific fine-tuning via Low-Rank Adaptation. The authors build a 50-sample banking dataset, map identified threats to NIST 800-53 controls, and demonstrate that a combined CoT+OPRO prompting strategy plus LoRA fine-tuning yields substantial performance gains over baselines, with notable improvements in threat identification accuracy, mitigation precision, and alignment with compliance codes. Experiments across Llama-3.1-8B and GPT-3.5-turbo show that fine-tuned small models can outperform larger pre-trained counterparts in this context, achieving high text similarity to ground truth and robust NIST code mappings. The results indicate strong potential for practical deployment in banking cybersecurity, enabling automated, compliant threat modeling with reduced human effort, while highlighting future work on cross-domain generalization and efficiency for scaling to larger models.
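The prompt-optimization stage can be pictured as a loop in the style of OPRO: past prompts and their scores are shown to an LLM, which proposes a new prompt; that prompt is scored and added to the history. The sketch below shows only this control flow. `toy_propose` and `toy_score` are hypothetical stand-ins for an LLM call and for an evaluation against ground truth. Neither they nor the meta-prompt wording are taken from the paper.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (prompt text, score in [0, 1])


def build_meta_prompt(history: List[Scored], top_k: int = 3) -> str:
    """List the top_k prompts so far in ascending score order (best last)."""
    best = sorted(history, key=lambda pair: pair[1])[-top_k:]
    lines = [f"prompt: {p}\nscore: {s:.2f}" for p, s in best]
    return (
        "Previous prompts and scores:\n"
        + "\n".join(lines)
        + "\nWrite a better prompt."
    )


def opro_loop(
    seed_prompt: str,
    propose: Callable[[str], str],
    score: Callable[[str], float],
    steps: int = 3,
    top_k: int = 3,
) -> Scored:
    """Run a simple propose-and-score loop and return the best prompt found."""
    history: List[Scored] = [(seed_prompt, score(seed_prompt))]
    for _ in range(steps):
        meta_prompt = build_meta_prompt(history, top_k)
        candidate = propose(meta_prompt)
        history.append((candidate, score(candidate)))
    return max(history, key=lambda pair: pair[1])


def toy_propose(meta_prompt: str) -> str:
    """Hypothetical stand-in for an LLM: extend the best prompt with a CoT cue."""
    prompts = [
        line[len("prompt: "):]
        for line in meta_prompt.splitlines()
        if line.startswith("prompt: ")
    ]
    return prompts[-1] + " Think step by step."


def toy_score(prompt: str) -> float:
    """Hypothetical stand-in for evaluation: reward chain-of-thought cues."""
    return min(1.0, 0.25 * (1 + prompt.count("step by step")))


if __name__ == "__main__":
    best_prompt, best_score = opro_loop(
        "List threats for the system.", toy_propose, toy_score
    )
    print(best_score, best_prompt)
```

In a real setting, `score` would compare model outputs against the ground-truth threats and mitigations, and `propose` would query the optimizer LLM with the meta-prompt.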

Abstract

Threat modeling is a crucial component of cybersecurity, particularly for industries such as banking, where the security of financial data is paramount. Traditional threat modeling approaches require expert intervention and manual effort, often leading to inefficiencies and human error. The advent of Large Language Models (LLMs) offers a promising avenue for automating these processes, enhancing both efficiency and efficacy. However, this transition is not straightforward due to three main challenges: (1) the lack of publicly available, domain-specific datasets, (2) the need for tailored models to handle complex banking system architectures, and (3) the requirement for real-time, adaptive mitigation strategies that align with compliance standards like NIST 800-53. In this paper, we introduce ThreatModeling-LLM, a novel and adaptable framework that automates threat modeling for banking systems using LLMs. ThreatModeling-LLM operates in three stages: 1) dataset creation, 2) prompt engineering, and 3) model fine-tuning. We first generate a benchmark dataset using the Microsoft Threat Modeling Tool (TMT). Then, we apply Chain of Thought (CoT) and Optimization by PROmpting (OPRO) to the pre-trained LLMs to optimize the initial prompt. Lastly, we fine-tune the LLM using Low-Rank Adaptation (LoRA) on the benchmark dataset and the optimized prompt to improve the threat identification and mitigation generation capabilities of pre-trained LLMs.
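For readers unfamiliar with LoRA, the idea behind the fine-tuning stage is that a frozen weight matrix W is adjusted by a trainable low-rank product, giving an effective weight W + (alpha / r) * B A. The following is a minimal pure-Python sketch of that update rule. The matrix sizes, the `alpha` value, and the helper names are illustrative assumptions, not settings or code from the paper.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def init_lora(d_out: int, d_in: int, rank: int, seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Create LoRA factors: B (d_out x rank) starts at zero, A (rank x d_in) is small noise."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return b, a


def lora_effective_weight(
    w: Matrix, b: Matrix, a: Matrix, alpha: float, rank: int
) -> Matrix:
    """Return W + (alpha / rank) * B A; W itself is never modified."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Number of trainable values in the two LoRA factors."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # frozen base weight (illustrative)
    b, a = init_lora(d_out=2, d_in=3, rank=1)
    # With B initialised to zero, the adapted layer matches the base layer.
    print(lora_effective_weight(w, b, a, alpha=2.0, rank=1) == w)
    # Full fine-tuning of a 4096 x 4096 layer vs. rank-8 LoRA factors.
    print(4096 * 4096, trainable_params(4096, 4096, rank=8))
```

Because only B and A are trained, the number of trainable values grows with r * (d_out + d_in) rather than d_out * d_in, which is what makes adapting an 8B-parameter model practical on modest hardware.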

Paper Structure

This paper contains 23 sections, 10 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Comparison of the traditional method and the LLM-based method. The traditional method (top) requires manual creation of Data Flow Diagrams (DFDs). After threats are identified, additional manual effort is needed to map them to mitigations and control codes. In contrast, the LLM-based process (bottom) streamlines the workflow by using system descriptions as input to automatically generate threats, corresponding mitigations, and the NIST 800-53 controls.
  • Figure 2: System Overview of ThreatModeling-LLM: (i) Data Creation: Uses the Microsoft Threat Modeling Tool to manually generate threat modeling samples; 50 manually verified samples form the ground truth dataset. (ii) Prompt Engineering: Manually designs the initial prompt for a Large Language Model (LLM), then optimizes that prompt to enhance model responses. (iii) Model Fine-tuning: Fine-tunes the LLM for threat modeling to improve its accuracy and reliability in threat detection and mitigation generation (i.e., NIST 800-53 control codes).
  • Figure 3: Dataset Creation Framework.
  • Figure 4: A light example of the Generated DFD.
  • Figure 5: The process of prompt design, combining CoT and Prompt Evolution based on OPRO.
  • ...and 4 more figures