AgentCgroup: Understanding and Controlling OS Resources of AI Agents

Yusheng Zheng; Jiakun Fan; Quanzhi Fu; Yiwei Yang; Wei Zhang; Andi Quinn

TL;DR

The paper tackles the problem of OS-level resource management for AI coding agents operating in sandboxed, multi-tenant cloud environments. It conducts a systematic characterization of resource dynamics across 144 SWE-rebench tasks using two LLM backends, revealing that OS-level execution accounts for the majority of latency and that memory is the principal bottleneck during concurrency, with highly bursty, tool-call-driven memory usage. Based on these findings, it proposes AgentCgroup, an eBPF-based controller that enforces fine-grained, tool-call-aligned resource domains via in-kernel scheduling and memory throttling, coupled with runtime-adaptive policies. Preliminary evaluation demonstrates improved multi-tenant isolation and reduced resource waste, highlighting the potential for kernel-level controls to address granularity, responsiveness, and adaptability gaps in existing resource management approaches for AI agents.

Abstract

AI agents are increasingly deployed in multi-tenant cloud environments, where they execute diverse tool calls within sandboxed containers, each call with distinct resource demands and rapid fluctuations. We present a systematic characterization of OS-level resource dynamics in sandboxed AI coding agents, analyzing 144 software engineering tasks from the SWE-rebench benchmark across two LLMs. Our measurements reveal that (1) OS-level execution (tool calls, container and agent initialization) accounts for 56-74% of end-to-end task latency; (2) memory, not CPU, is the concurrency bottleneck; (3) memory spikes are tool-call-driven, with a peak-to-average ratio of up to 15.4x; and (4) resource demands are highly unpredictable across tasks, runs, and models. Comparing these characteristics against serverless, microservice, and batch workloads, we identify three mismatches in existing resource controls: a granularity mismatch (container-level policies vs. tool-call-level dynamics), a responsiveness mismatch (user-space reaction vs. sub-second unpredictable bursts), and an adaptability mismatch (history-based prediction vs. non-deterministic stateful execution). We propose AgentCgroup, an eBPF-based resource controller that addresses these mismatches through hierarchical cgroup structures aligned with tool-call boundaries, in-kernel enforcement via sched_ext and memcg_bpf_ops, and runtime-adaptive policies driven by in-kernel monitoring. Preliminary evaluation demonstrates improved multi-tenant isolation and reduced resource waste.
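To make the peak-to-average figure concrete, the following is a minimal sketch of how such a ratio could be computed from a series of memory samples. It assumes the ratio is simply the maximum sample divided by the arithmetic mean; the abstract does not give the exact definition or sampling interval, so this may differ from the paper's. The function name `peak_to_average` and the trace values are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def peak_to_average(samples_mib):
    """Return max(samples) / mean(samples) for a non-empty series of readings."""
    if not samples_mib:
        raise ValueError("need at least one sample")
    average = fmean(samples_mib)
    if average <= 0:
        raise ValueError("average must be positive")
    return max(samples_mib) / average


# Hypothetical trace (MiB): mostly idle memory with one short burst,
# standing in for a single memory-hungry tool call. Not measured data.
trace_mib = [100, 100, 100, 100, 100, 100, 100, 100, 100, 1100]
ratio = peak_to_average(trace_mib)
print(f"peak-to-average ratio: {ratio:.1f}x")  # prints "peak-to-average ratio: 5.5x"
```

A high ratio of this kind means that a static limit sized for the peak leaves most of the reservation idle, while a limit sized for the average is exceeded during bursts, which is the tension the granularity mismatch describes.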
