DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs

Justin Albrethsen; Yash Datta; Kunal Kumar; Sharath Rajasekar

TL;DR

DeepContext introduces a stateful, recurrent framework to detect multi-turn adversarial intent drift in LLMs, addressing the Safety Gap left by stateless guardrails. By combining task-attention weighted turn embeddings with a GRU-based memory and a trajectory classifier, it tracks intent evolution across turns rather than re-evaluating static histories. The approach achieves a state-of-the-art F1 of 0.84 on multi-turn jailbreak benchmarks with sub-20 ms latency on a T4 GPU, outperforming both lightweight encoders and larger stateless models. This work demonstrates that modeling the sequential evolution of user intent offers superior protection with lower computational cost, enabling real-time defense in enterprise and agentic settings.

Abstract

While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of disconnected events. This lack of temporal awareness creates a "Safety Gap" in which adversarial tactics such as Crescendo and ActorAttack slowly bleed malicious intent across turn boundaries to bypass stateless filters. We introduce DeepContext, a stateful monitoring framework designed to map the temporal trajectory of user intent. DeepContext replaces isolated per-turn evaluation with a Recurrent Neural Network (RNN) architecture that ingests a sequence of fine-tuned turn-level embeddings. By propagating a hidden state across the conversation, DeepContext captures the incremental accumulation of risk that stateless models overlook. Our evaluation demonstrates that DeepContext significantly outperforms existing baselines in multi-turn jailbreak detection, achieving a state-of-the-art F1 score of 0.84, a substantial improvement over both hyperscaler cloud-provider guardrails and leading open-weight models such as Llama-Prompt-Guard-2 (0.67) and Granite-Guardian (0.67). Furthermore, DeepContext maintains a sub-20 ms inference overhead on a T4 GPU, ensuring viability for real-time applications. These results suggest that modeling the sequential evolution of intent is a more effective and computationally efficient alternative to deploying massive, stateless models.

Paper Structure (35 sections, 10 equations, 1 figure, 7 tables)

Introduction
The Computational Bottleneck of Current Defenses
DeepContext: Statefulness Through Recurrent Intent Tracking
Related Works
The Attacker Landscape: Cognitive and Sequential Exploits
The Defender Landscape: The Stateless Limitation
DeepContext: Attention-Weighted Recurrent Tracking
Methodology: Stateful Intent Tracking via Recurrent Latent Embeddings
Problem Formulation: Adversarial Accumulation
Model Architecture: The DeepContext Pipeline
Task-Attention Weighted Encoder
Recurrent Intent Tracking (RIT) via Gated Recurrent Units
Projection Layer and Residual Shortcuts
Trajectory Classifier
Training and Dataset Compilation
...and 20 more sections

Figures (1)

Figure 1: The DeepContext Architecture. The pipeline consists of three main stages: (1) turn-level embedding extraction using a fine-tuned BERT and task-specific weighted pooling; (2) recurrent intent tracking via a GRU to maintain conversation state; and (3) a trajectory classifier with a hybrid residual connection for final safety scoring.
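
The three stages in Figure 1 can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: random weights stand in for trained parameters, random vectors stand in for BERT token outputs, and every name (`attention_pool`, `TinyGRU`, `risk_score`, `monitor`) is hypothetical. The residual shortcut is modeled here as the classifier reading both the hidden state and the current turn embedding, which is an assumption about what "hybrid residual connection" means. Biases are omitted for brevity.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, v):
    return [dot(row, v) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def attention_pool(token_vecs, query):
    """Stage 1 (assumed form): softmax-weighted sum of token vectors."""
    weights = softmax([dot(tok, query) for tok in token_vecs])
    return [sum(w * tok[i] for w, tok in zip(weights, token_vecs))
            for i in range(len(query))]

class TinyGRU:
    """Stage 2: a single GRU cell; each gate reads the concatenation [x; h]."""
    def __init__(self, in_dim, hid_dim, rng):
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(v) for v in matvec(self.Wz, x + h)]
        r = [sigmoid(v) for v in matvec(self.Wr, x + h)]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, x + gated_h)]
        # Convex blend of the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def risk_score(h, x, w_h, w_x):
    """Stage 3 (assumed form): logistic score over the state plus a shortcut from x."""
    return sigmoid(dot(w_h, h) + dot(w_x, x))

rng = random.Random(0)
EMB, HID = 4, 3
query = [rng.uniform(-1, 1) for _ in range(EMB)]
gru = TinyGRU(EMB, HID, rng)
w_h = [rng.uniform(-1, 1) for _ in range(HID)]
w_x = [rng.uniform(-1, 1) for _ in range(EMB)]

def monitor(conversation):
    """Score each turn while carrying one hidden state across the conversation."""
    h = [0.0] * HID
    scores = []
    for turn_tokens in conversation:
        x = attention_pool(turn_tokens, query)
        h = gru.step(x, h)  # only the new turn is processed; history lives in h
        scores.append(risk_score(h, x, w_h, w_x))
    return scores, h

# Three turns with 3, 5, and 2 placeholder token vectors each.
conversation = [[[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(n)]
                for n in (3, 5, 2)]
scores, final_h = monitor(conversation)
```

The point of the sketch is the loop in `monitor`: each new turn costs one pooling pass and one GRU step, and earlier turns are never re-encoded, in contrast to a stateless filter that would need to re-read the whole history to see gradual drift.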
