ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications

Felix Viktor Jedrzejewski; Davide Fucci; Oleksandr Adamov

TL;DR

The paper addresses the security risks introduced by Large Language Model-Integrated Applications (LIAs) and proposes ThreMoLIA, an LLM-assisted threat-modeling framework. It employs a composable pipeline that combines Retrieval Augmented Generation (RAG), data aggregation, prompting, and quality assurance to generate and continuously update threat models, grounded in MITRE ATLAS and the OWASP Top 10 for LLMs. The work highlights LIA-specific threats such as prompt injection and data poisoning, and outlines data-quality and evaluation challenges, along with an industry-collaborative evaluation plan and a preliminary ChatGPT-based prototype. The practical impact lies in reducing the reliance on security experts, accelerating threat modeling in industrial contexts, and enabling ongoing risk management throughout the software lifecycle.

Abstract

Large Language Models (LLMs) are currently being integrated into industrial software applications to help users perform more complex tasks in less time. However, these LLM-Integrated Applications (LIAs) expand the attack surface and introduce new kinds of threats. Threat modeling is commonly used to identify these threats and suggest mitigations. However, it is a time-consuming practice that requires the involvement of a security practitioner. Our goals are to (1) provide a method for performing threat modeling for LIAs early in their lifecycle, (2) develop a threat modeling tool that integrates existing threat models, and (3) ensure high-quality threat modeling. To achieve these goals, we work in collaboration with our industry partner. Our proposed way of performing threat modeling will benefit industry by requiring fewer security experts' participation and reducing the time spent on this activity. Our proposed tool combines LLMs and Retrieval Augmented Generation (RAG) and uses sources such as existing threat models and application architecture repositories to continuously create and update threat models. We propose to evaluate the tool offline (i.e., using benchmarking) and online with practitioners in the field. We conducted an early evaluation using ChatGPT on a simple LIA and obtained results that encouraged us to proceed with our research efforts.
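
The combination of retrieval, prompting, and quality assurance described here can be illustrated with a minimal Python sketch. It is not the authors' implementation: the knowledge-base entries, the keyword-overlap retrieval (standing in for a real RAG retriever), the `stub_llm` function (standing in for a real model call), and every function name are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ThreatEntry:
    name: str
    description: str
    mitigation: str


# Hypothetical, hand-written entries. A real tool would draw on sources
# such as existing threat models and architecture repositories.
KNOWLEDGE_BASE = [
    ThreatEntry(
        "prompt injection",
        "untrusted user input or retrieved documents alter the instructions given to the llm",
        "separate and validate untrusted input before it reaches the prompt",
    ),
    ThreatEntry(
        "data poisoning",
        "tampered training or retrieval data changes model behavior",
        "verify the provenance and integrity of training and retrieval data",
    ),
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(architecture: str, k: int = 2) -> List[ThreatEntry]:
    """Toy retrieval: rank entries by word overlap with the architecture text."""
    query = tokenize(architecture)
    scored = [
        (len(query & tokenize(entry.name + " " + entry.description)), entry)
        for entry in KNOWLEDGE_BASE
    ]
    # Drop entries with no overlap, then keep the k best matches.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


def build_prompt(architecture: str, entries: List[ThreatEntry]) -> str:
    """Combine the architecture description with the retrieved context."""
    context = "\n".join(f"- {e.name}: {e.description}" for e in entries)
    return (
        "List the threats that apply to this application.\n"
        f"Architecture: {architecture}\n"
        f"Known threats:\n{context}"
    )


def quality_check(draft: List[str], entries: List[ThreatEntry]) -> List[str]:
    """Keep only drafted threats that are grounded in a retrieved entry."""
    known = {e.name for e in entries}
    return [t.lower() for t in draft if t.lower() in known]


def stub_llm(prompt: str) -> List[str]:
    """Placeholder for a real model call; returns one ungrounded threat on purpose."""
    return ["Prompt Injection", "Made-Up Threat"]


def threat_model(architecture: str, llm: Callable[[str], List[str]]) -> List[str]:
    """Run retrieval, prompting, the model call, and the quality check in order."""
    entries = retrieve(architecture)
    draft = llm(build_prompt(architecture, entries))
    return quality_check(draft, entries)
```

Calling `threat_model("A chatbot passes user input and retrieved documents to an LLM", stub_llm)` retrieves the prompt-injection entry, and the quality check discards the stub's ungrounded "Made-Up Threat". Keeping each stage a plain function mirrors the composable design: a stage can be swapped (for example, the toy retriever for an embedding-based one) without touching the others.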
