Terminal Agents Suffice for Enterprise Automation

Patrice Bechard, Orlando Marquez Ayala, Emily Chen, Jordan Skelton, Sagar Davasam, Srinivas Sunkara, Vikas Yadav, Sai Rajeswar

Abstract

There has been growing interest in building agents that can interact with digital platforms to execute meaningful enterprise tasks autonomously. Among the approaches explored are tool-augmented agents built on abstractions such as Model Context Protocol (MCP) and web agents that operate through graphical interfaces. Yet, it remains unclear whether such complex agentic systems are necessary given their cost and operational overhead. We argue that a coding agent equipped only with a terminal and a filesystem can solve many enterprise tasks more effectively by interacting directly with platform APIs. We evaluate this hypothesis across diverse real-world systems and show that these low-level terminal agents match or outperform more complex agent architectures. Our findings suggest that simple programmatic interfaces, combined with strong foundation models, are sufficient for practical enterprise automation.

Paper Structure

This paper contains 50 sections, 6 figures, and 14 tables.

Figures (6)

  • Figure 1: Execution traces of agents ordering an iPad Pro. (Top-left) The MCP agent identifies the catalog item but cannot proceed without an ordering tool. It falls back to creating a support ticket and fails. (Top-right) The web agent reaches the catalog page but becomes confused within the iframe-based UI, leading to a long, costly trajectory that fails to complete the order. (Bottom) The terminal agent encounters JSON quoting errors when constructing the request payload and 404 responses from incorrect API endpoints, but recovers by writing the payload to a temporary file and exploring alternative endpoints. It completes the task at an order of magnitude lower cost than the web agent, demonstrating the flexibility, resilience, and efficiency of direct terminal-based interaction.
  • Figure 2: Overview of StarShell: a minimal terminal agent for enterprise automation. The agent operates through a terminal and filesystem, optionally using documentation and persistent skills to discover and invoke APIs directly on enterprise platforms (e.g., GitLab, ServiceNow, ERPNext), without relying on GUI interaction or pre-defined tool registries.
  • Figure 3: Skills accumulation over sequential tasks. Top: cumulative number of successful tasks. Middle: cumulative cost (USD). Bottom: skills directory size (KB). The agent with memory (blue) accumulates reusable procedures; the baseline (black) starts fresh every time.
  • Figure 4: Distribution of tool calls per task for successful and failed tasks. Tasks exceeding 50 calls are capped at 50.
  • Figure 5: API error type breakdown as a percentage of total tool calls, comparing successful and failed tasks.
  • ...and 1 more figure
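
The recovery pattern described in the Figure 1 caption (writing the request payload to a temporary file to sidestep shell quoting errors, then trying alternative endpoints after a 404) can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the helper names `write_payload`, `curl_command`, and `first_working_endpoint`, the example URLs, and the injected `send` callable are all hypothetical, and the real platform endpoints are not shown in the source.

```python
import json
import os
import tempfile
from typing import Callable, Iterable, List, Optional, Tuple


def write_payload(payload: dict) -> str:
    """Serialize the payload to a temporary JSON file and return its path.

    Keeping the JSON in a file means no quotes need escaping on the command line.
    """
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    return path


def curl_command(url: str, payload_path: str) -> List[str]:
    """Build a curl argv list that reads the request body from a file.

    `--data-binary @file` tells curl to send the file contents unchanged.
    """
    return [
        "curl", "-s", "-X", "POST", url,
        "-H", "Content-Type: application/json",
        "--data-binary", "@" + payload_path,
    ]


def first_working_endpoint(
    candidates: Iterable[str],
    send: Callable[[str], int],
) -> Optional[Tuple[str, int]]:
    """Try candidate URLs in order and return the first one that is not a 404.

    `send` is a hypothetical callable that issues the request and returns the
    HTTP status code; it is injected so the search logic can be tested offline.
    """
    for url in candidates:
        status = send(url)
        if status != 404:
            return url, status
    return None
```

Passing the command as an argument list (for example to `subprocess.run`) rather than as a single shell string avoids a second layer of quoting, which is the same class of error the caption describes.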