LLM-enabled Applications Require System-Level Threat Monitoring

Yedi Zhang; Haoyu Wang; Xianglin Yang; Jin Song Dong; Jun Sun

TL;DR

The paper argues that deploying LLM-enabled applications creates reliability and security risks that model improvements alone cannot address, necessitating system-level threat monitoring akin to endpoint detection and response (EDR) for traditional software. It introduces a taxonomy-driven monitoring schema that links fourteen threat categories to concrete monitoring artifacts and audit-logging practices across the end-to-end workflow: prompt injection, adversarial inputs, response manipulation, DoS, data poisoning, model poisoning, data leakage, cross-context disclosure, memorisation leakage, theft, watermark evasion, drift, misinformation, and misuse. The authors emphasize post-monitoring incident analysis and identify open challenges such as curating a corpus of suspicious patterns, the latency of context inspection, and limited observability in closed deployments. They also discuss alternative views, such as red-teaming and guardrails, and treat them as complementary approaches. Overall, the work advocates continuous, system-wide telemetry and forensic capabilities as prerequisites for reliable operation and robust incident response in LLM-enabled applications, enabling timely detection, containment, and recovery.

Abstract

LLM-enabled applications are rapidly reshaping the software ecosystem by using large language models as core reasoning components for complex task execution. This paradigm shift, however, introduces fundamentally new reliability challenges and significantly expands the security attack surface, due to the non-deterministic, learning-driven, and difficult-to-verify nature of LLM behavior. In light of these emerging and unavoidable safety challenges, we argue that such risks should be treated as expected operational conditions rather than exceptional events, necessitating a dedicated incident-response perspective. Consequently, the primary barrier to trustworthy deployment is not further improving model capability but establishing system-level threat monitoring mechanisms that can detect and contextualize security-relevant anomalies after deployment -- an aspect largely underexplored beyond testing or guardrail-based defenses. Accordingly, this position paper advocates systematic and comprehensive monitoring of security threats in LLM-enabled applications as a prerequisite for reliable operation and a foundation for dedicated incident-response frameworks.

Paper Structure (97 sections, 1 figure, 1 table)

Introduction
Preliminaries and Scope
Preliminaries
Scope
System-Level Threat Monitoring Schema
Prompt Injection
Attack Vectors & Monitoring Artifacts
Direct Prompt Injection
Injected Instructions in RAG
Service API Outputs
Audit Logging
Adversarial Inputs
Attack Vectors & Monitoring Artifacts
Lexical Obfuscation
Embedding-level Attacks
...and 82 more sections

Figures (1)

Figure 1: A representative LLM-enabled application workflow: The user submits an initial prompt (Stage 1); the client, responsible for orchestration, queries the MCP service for available tools (Stage 2) and forwards an integrated prompt to the LLM brain (Stage 3), which may interact with external resources such as vector databases, (Graph)-RAG systems, and memory (Stage 3*). The brain produces intermediate responses and tool plans (Stage 4); the client executes the selected tools via MCP and gathers results (Stage 5), assembles the final prompt (Stage 6), obtains the final response from the brain (Stage 7), and delivers it to the user (Stage 8).
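
The staged workflow in Figure 1 suggests one natural place to attach audit logging: a record emitted at each stage boundary, keyed by session, so that an incident can later be traced end to end. The following is a minimal sketch of that idea and not the authors' implementation. The names `Stage`, `AuditEvent`, and `AuditLog`, and the choice of fields, are hypothetical and chosen only for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from enum import Enum


class Stage(Enum):
    """Workflow stages as labelled in Figure 1 (enum names are hypothetical)."""
    USER_PROMPT = "1"
    TOOL_DISCOVERY = "2"
    INTEGRATED_PROMPT = "3"
    EXTERNAL_RESOURCE = "3*"
    INTERMEDIATE_PLAN = "4"
    TOOL_EXECUTION = "5"
    FINAL_PROMPT = "6"
    FINAL_RESPONSE = "7"
    DELIVERY = "8"


@dataclass
class AuditEvent:
    """One audit record; the field set is an assumption, not from the paper."""
    session_id: str
    stage: str
    payload: str
    timestamp: float = field(default_factory=time.time)


class AuditLog:
    """In-memory, append-only log of stage-boundary events."""

    def __init__(self):
        self.events = []

    def record(self, session_id, stage, payload):
        # Store the stage label (e.g. "3*") rather than the enum object
        # so that events serialise to JSON without extra handling.
        event = AuditEvent(session_id, stage.value, payload)
        self.events.append(event)
        return event

    def trace(self, session_id):
        # Ordered list of stage labels observed for one session.
        return [e.stage for e in self.events if e.session_id == session_id]

    def to_jsonl(self):
        # One JSON object per line, convenient for later forensic analysis.
        return "\n".join(json.dumps(asdict(e)) for e in self.events)


log = AuditLog()
log.record("session-a", Stage.USER_PROMPT, "Summarise my unread mail")
log.record("session-a", Stage.TOOL_DISCOVERY, "available tools: mail.read")
log.record("session-a", Stage.INTEGRATED_PROMPT, "system + user + tool list")
print(log.trace("session-a"))
```

Keeping the stage label on every record lets an analyst reconstruct which boundary a suspicious payload crossed, for example whether injected instructions entered with the user prompt (Stage 1) or with tool results (Stage 5). A production system would also need durable storage, redaction of sensitive payloads, and tamper resistance, none of which this sketch attempts.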
