AgentOps: Enabling Observability of LLM Agents

Liming Dong, Qinghua Lu, Liming Zhu

TL;DR

This paper tackles AI safety concerns in autonomous LLM agents by emphasizing the need for strong observability. It conducts a systematic mapping study of existing AgentOps tools to identify capabilities and gaps, and proposes an artifact relationship model plus a comprehensive AgentOps taxonomy to guide instrumentation for monitoring, logging, and analytics. The study analyzes 17 tools, highlighting core features such as customization, prompt management, evaluation, feedback, monitoring, tracing, and guardrails, and outlines an ER model and a hierarchical span-based taxonomy for tracing agent artifacts. The work provides a structured reference template for developers to design AgentOps infrastructures, with future work focused on real-world validation and prototype development to advance AI safety and observability in LLM-powered agents.

Abstract

Large language model (LLM) agents have demonstrated remarkable capabilities across various domains, gaining extensive attention from academia and industry. However, these agents raise significant concerns about AI safety due to their autonomous and non-deterministic behavior, as well as their continuously evolving nature. From a DevOps perspective, enabling observability in agents is necessary for ensuring AI safety, as stakeholders can gain insights into the agents' inner workings, allowing them to proactively understand the agents, detect anomalies, and prevent potential failures. Therefore, in this paper, we present a comprehensive taxonomy of AgentOps, identifying the artifacts and associated data that should be traced throughout the entire lifecycle of agents to achieve effective observability. The taxonomy is developed based on a systematic mapping study of existing AgentOps tools. Our taxonomy serves as a reference template for developers to design and implement AgentOps infrastructure that supports monitoring, logging, and analytics, thereby ensuring AI safety.
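To make the idea of hierarchical, span-based tracing more concrete, the following is a minimal Python sketch of how nested spans could record an agent's steps. It is an illustration under stated assumptions, not the paper's taxonomy or the API of any tool it surveys: the names `Span`, `Tracer`, `run_demo`, the span kinds ("agent", "reasoning", "llm"), and the attribute keys are all hypothetical.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One traced unit of work (hypothetical structure, for illustration)."""
    name: str
    kind: str                          # e.g. "agent", "reasoning", "llm" (assumed labels)
    parent_id: Optional[str] = None    # None marks the root span of a trace
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    attributes: dict = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0


class Tracer:
    """Collects spans and links each one to the span that was open when it began."""

    def __init__(self):
        self.spans = []     # all spans, in the order they were started
        self._stack = []    # currently open spans, innermost last

    @contextmanager
    def span(self, name, kind, **attributes):
        parent_id = self._stack[-1].span_id if self._stack else None
        s = Span(name=name, kind=kind, parent_id=parent_id,
                 attributes=dict(attributes))
        s.start = time.monotonic()
        self.spans.append(s)
        self._stack.append(s)
        try:
            yield s
        finally:
            # Close the span even if the traced step raised an exception.
            s.end = time.monotonic()
            self._stack.pop()


def run_demo():
    """Trace a made-up agent run: one root span with two child spans."""
    tracer = Tracer()
    with tracer.span("answer_question", kind="agent"):
        with tracer.span("plan", kind="reasoning"):
            pass  # placeholder for a planning step
        with tracer.span("call_llm", kind="llm", model="hypothetical-model") as llm:
            llm.attributes["prompt_tokens"] = 12  # illustrative value only
    return tracer
```

Calling `run_demo()` returns a tracer whose `spans` list holds the root span followed by its two children, each child carrying the root's `span_id` as its `parent_id`. Keeping the parent link on every span is what lets a monitoring or analytics layer rebuild the full tree of an agent run afterwards.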

Paper Structure

This paper contains 22 sections, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Search Process of AgentOps Relevant Tools
  • Figure 2: Entity-Relationship Model for Agent Artifacts
  • Figure 3: Taxonomy of AgentOps