AgentSight: System-Level Observability for AI Agents Using eBPF

Yusheng Zheng, Yanpeng Hu, Tong Yu, Andi Quinn

TL;DR

This work tackles the challenge of monitoring AI agents that couple LLM reasoning with autonomous tool use, a combination that opens a semantic gap between an agent's intended goals and its observable actions. It introduces AgentSight, a boundary-tracing observability framework that uses in-kernel eBPF probes to intercept TLS-encrypted LLM traffic (capturing intent) and kernel events (capturing actions), coupled with a real-time correlation engine and a secondary LLM for semantic analysis, with overhead below 3% in realistic workloads. The key contributions are a principled boundary-tracing design, a multi-signal causality engine (Process Lineage, Temporal Proximity, Argument Matching), and demonstrations of detecting prompt injections, reasoning loops, and multi-agent coordination bottlenecks. The approach is instrumentation-free and framework-agnostic, offering a scalable, resilient path toward secure deployment of autonomous AI systems in production.

Abstract

Modern software infrastructure increasingly relies on LLM agents, such as Claude Code and Gemini-cli, for development and maintenance. However, these AI agents differ fundamentally from traditional deterministic software, posing a significant challenge to conventional monitoring and debugging. The result is a critical semantic gap: existing tools observe either an agent's high-level intent (via LLM prompts) or its low-level actions (e.g., system calls), but cannot correlate these two views. This blindness makes it difficult to distinguish between benign operations, malicious attacks, and costly failures. We introduce AgentSight, an AgentOps observability framework that bridges this semantic gap using a hybrid approach. Our approach, boundary tracing, monitors agents from outside their application code at stable system interfaces using eBPF. AgentSight intercepts TLS-encrypted LLM traffic to extract semantic intent, monitors kernel events to observe system-wide effects, and causally correlates these two streams across process boundaries using a real-time engine and secondary LLM analysis. This instrumentation-free technique is framework-agnostic, resilient to rapid API changes, and incurs less than 3% performance overhead. Our evaluation shows that AgentSight detects prompt injection attacks, identifies resource-wasting reasoning loops, and reveals hidden coordination bottlenecks in multi-agent systems. AgentSight is released as an open-source project at https://github.com/agent-sight/agentsight.
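
To make the correlation step more concrete, here is a minimal Python sketch of how the three signals named in the TL;DR (process lineage, temporal proximity, argument matching) might link one intercepted LLM message to one kernel event. It is not AgentSight's implementation: the `LLMEvent` and `SysEvent` types, the `correlate` function, the child-to-parent PID map, and the 5-second window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LLMEvent:
    """Hypothetical record of one intercepted LLM message."""
    pid: int      # process that exchanged the LLM traffic
    ts: float     # timestamp in seconds
    text: str     # decrypted prompt or response text


@dataclass
class SysEvent:
    """Hypothetical record of one kernel-level event (e.g. an exec)."""
    pid: int
    ts: float
    args: List[str]   # e.g. argv of a spawned command, or a file path


def ancestors(pid: int, parent_of: Dict[int, int]) -> set:
    """Return pid plus all of its ancestors, following a child -> parent map."""
    seen = set()
    while pid is not None and pid not in seen:
        seen.add(pid)
        pid = parent_of.get(pid)
    return seen


def correlate(llm: LLMEvent, ev: SysEvent,
              parent_of: Dict[int, int], window_s: float = 5.0) -> List[str]:
    """Return the names of the signals that link ev back to llm."""
    signals = []
    # Process lineage: the event comes from the agent or one of its descendants.
    if llm.pid in ancestors(ev.pid, parent_of):
        signals.append("process_lineage")
    # Temporal proximity: the event follows the LLM message within the window.
    if 0.0 <= ev.ts - llm.ts <= window_s:
        signals.append("temporal_proximity")
    # Argument matching: an event argument appears verbatim in the LLM text.
    if any(arg and arg in llm.text for arg in ev.args):
        signals.append("argument_matching")
    return signals


# Agent (pid 100) spawns a shell (pid 200), which spawns `cat` (pid 300).
parent_of = {200: 100, 300: 200}
llm = LLMEvent(pid=100, ts=10.0, text="run: cat /etc/passwd")
related = SysEvent(pid=300, ts=11.5, args=["cat", "/etc/passwd"])
unrelated = SysEvent(pid=999, ts=100.0, args=["/tmp/other.log"])

related_signals = correlate(llm, related, parent_of)
unrelated_signals = correlate(llm, unrelated, parent_of)
```

In this toy setup, the `cat` process spawned two levels below the agent matches on all three signals, while the event from an unrelated process matches on none. A real engine would also have to cope with clock skew, PID reuse, and fuzzy rather than verbatim argument matches, which this sketch ignores.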

Paper Structure

This paper contains 19 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Agent Framework Overview
  • Figure 2: AgentSight System Architecture