PriMod4AI: Lifecycle-Aware Privacy Threat Modeling for AI Systems using LLM

Gautam Savaliya; Robert Aufschläger; Abhishek Subedi; Michael Heigl; Martin Schramm

TL;DR

PriMod4AI addresses the gap in privacy threat modeling for AI systems by unifying classical LINDDUN threats with AI-specific model-centric risks in a lifecycle-aware framework. It leverages two structured knowledge bases (LINDDUN_KB and AI_Privacy_KB), DFD-derived system metadata, and retrieval-augmented prompting to produce justified, taxonomy-grounded threat assessments via open-source LLMs. The approach yields broad LINDDUN coverage and identifies model-centric threats across two realistic use cases, with cross-model agreement indicating robust, reproducible reasoning across GPT-OSS and LLaMA variants. This work advances privacy-by-design in AI by delivering explainable, scalable threat identification grounded in both domain knowledge and system architecture.

Abstract

Artificial intelligence systems introduce complex privacy risks throughout their lifecycle, especially when processing sensitive or high-dimensional data. Beyond the seven traditional privacy threat categories defined by the LINDDUN framework, AI systems are also exposed to model-centric privacy attacks, such as membership inference and model inversion, which LINDDUN does not cover. To address both classical LINDDUN threats and these additional AI-driven privacy attacks, PriMod4AI introduces a hybrid privacy threat modeling approach that unifies two structured knowledge sources: a LINDDUN knowledge base representing the established taxonomy, and a model-centric privacy attack knowledge base capturing threats outside LINDDUN. Both knowledge bases are embedded into a vector database for semantic retrieval and combined with system-level metadata derived from a Data Flow Diagram (DFD). PriMod4AI uses retrieval-augmented, data-flow-specific prompt generation to guide large language models (LLMs) in identifying, explaining, and categorizing privacy threats across lifecycle stages. The framework produces justified, taxonomy-grounded threat assessments that integrate both classical and AI-driven perspectives. Evaluation on two AI systems indicates that PriMod4AI provides broad coverage of classical privacy categories while additionally identifying model-centric privacy threats. The framework also produces consistent, knowledge-grounded outputs across LLMs, as reflected in the cross-model agreement scores reported in the evaluation.
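The retrieval-augmented, per-data-flow prompting described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the knowledge-base entries are abbreviated placeholders, the bag-of-words cosine similarity stands in for the embedding-based vector database, and the names `retrieve` and `build_prompt`, the flow fields, and the requested JSON keys are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, heavily abbreviated knowledge-base entries (placeholders, not the paper's KBs).
LINDDUN_KB = [
    {"id": "L", "name": "Linking", "text": "linking data items or actions to the same individual across records"},
    {"id": "I", "name": "Identifying", "text": "identifying an individual from stored or transmitted data such as a face image"},
    {"id": "DD", "name": "Data disclosure", "text": "excessive collection storage or sharing of personal data"},
]
AI_PRIVACY_KB = [
    {"id": "MI", "name": "Membership inference", "text": "inferring whether a record was part of the model training data from model outputs"},
    {"id": "MInv", "name": "Model inversion", "text": "reconstructing training inputs such as a face image from model outputs or confidence scores"},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, kb, k=2):
    """Return the k entries most similar to the query (toy stand-in for vector search)."""
    q = Counter(tokenize(query))
    return sorted(kb, key=lambda e: cosine(q, Counter(tokenize(e["text"]))), reverse=True)[:k]


def build_prompt(flow, k=2):
    """Assemble one prompt for a single DFD data flow from both knowledge bases."""
    query = f'{flow["source"]} {flow["target"]} {flow["data"]} {flow["stage"]}'
    hits = retrieve(query, LINDDUN_KB, k) + retrieve(query, AI_PRIVACY_KB, k)
    context = "\n".join(f'- [{e["id"]}] {e["name"]}: {e["text"]}' for e in hits)
    return (
        f'Data flow {flow["id"]}: {flow["source"]} -> {flow["target"]}\n'
        f'Data: {flow["data"]}\nLifecycle stage: {flow["stage"]}\n\n'
        f"Relevant threat knowledge:\n{context}\n\n"
        "Identify the privacy threats for this data flow. Answer as JSON with the keys "
        '"threat", "category", and "justification".'
    )


# Hypothetical data flow, loosely modeled on a face authentication system.
flow = {
    "id": "DF3",
    "source": "Inference Service",
    "target": "Client App",
    "data": "confidence scores for face image",
    "stage": "inference",
}
prompt = build_prompt(flow)
print(prompt)
```

The resulting string would be sent to an LLM once per data flow. Retrieving from the two knowledge bases separately is one simple way to ensure that both classical and model-centric candidates reach the prompt; a real deployment would replace the token-overlap scoring with embeddings.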

Paper Structure (32 sections, 3 equations, 4 figures, 8 tables, 1 algorithm)

Introduction
Related Works
Privacy Threat Modeling
AI-Specific Privacy Risks and Taxonomies
Automated Threat Identification with LLMs
Retrieval-Augmented Generation in Threat Modeling
Proposed Method
Knowledge Base Construction
DFD Representation of the AI System
Retrieval-Augmented Prompt Generation
Base Prompt Template
Per-DF Prompt Construction
Retrieval-Augmented Generation
Open-Source LLM Integration
Structured Output Generation
...and 17 more sections

Figures (4)

Figure 1: A six-phase AI development lifecycle encompassing data collection, model building, training, deployment, inference, and continuous monitoring. The diagram maps distinct privacy risks (shown in red boxes) to their corresponding stages within the lifecycle.
Figure 2: Architecture of the proposed PriMod4AI framework for automated privacy threat modeling in AI systems. The framework integrates a LINDDUN + AI-specific privacy threat knowledge base, DFD representation of AI systems, and open-source LLMs for prompt-based threat identification, producing structured JSON outputs.
Figure 3: DFD of the AI-based Face Authentication System, adapted from the open-source PILLAR repository.
Figure 4: DFD of the autonomous driving system showing key processes, data stores, and data flows (DF1–DF14).