Agent-S: LLM Agentic workflow to automate Standard Operating Procedures
Mandar Kulkarni
TL;DR
The paper tackles automating Standard Operating Procedures (SOPs) in customer care by proposing an LLM-driven agentic workflow. It models an SOP as a directed acyclic graph (DAG) of steps and deploys three task-specific LLMs along with a Global Action Repository, execution memory, and multiple environments, augmented by a Retrieval-Augmented Generation (RAG) system for handling user questions. Empirical results on three seller SOPs show the agent can navigate complex flows in both synthetic and live chats, with the state decision component (GPT-4o-mini) achieving high accuracy and feedback loops effectively handling failures. The approach offers scalable, reusable automation for SOP-driven workflows and generalizes to other DAG-based processes in customer support and beyond.
Abstract
AI agents built on Large Language Models (LLMs) have shown promise in solving complex real-world tasks. In this paper, we propose an LLM-based agentic workflow for automating Standard Operating Procedures (SOPs). For customer care operations, an SOP defines a logical step-by-step process for human agents to resolve customer issues. We observe that any step in an SOP can be categorized as either a user interaction or an API call, while the logical flow of the SOP defines the navigation between steps. We use LLMs augmented with memory and environments (API tools, user interface, external knowledge source) for SOP automation. Our agentic architecture consists of three task-specific LLMs, a Global Action Repository (GAR), execution memory, and multiple environments. The SOP workflow is written as a simple logical block of text. Based on the current execution memory and the SOP, the agent chooses the action to execute; it then interacts with the appropriate environment (user or API) to collect observations and feedback, which are in turn written to memory and used to decide the next action. The agent is designed to be fault-tolerant: it dynamically decides whether to repeat an action or seek input from an external knowledge source. We demonstrate the efficacy of the proposed agent on three SOPs from the e-commerce seller domain. The experimental results validate the agent's performance under complex real-world scenarios.
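The decide, act, observe, and remember loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: every name here (`Action`, `ExecutionMemory`, `GAR`, `SOP`, `decide_next`, `run_agent`, and the order-status steps) is hypothetical. The SOP is encoded as a small dictionary-based DAG rather than a block of text, the rule-based `decide_next` stands in for the state-decision LLM, and canned callables stand in for the user and API environments. The fallback to an external knowledge source is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Observation = Tuple[str, bool]  # (observation text, success flag)

@dataclass
class Action:
    env: str                        # "user" or "api"
    run: Callable[[], Observation]  # executes the step in its environment

@dataclass
class ExecutionMemory:
    entries: List[dict] = field(default_factory=list)

    def add(self, action: str, obs: str, ok: bool) -> None:
        self.entries.append({"action": action, "obs": obs, "ok": ok})

def make_flaky_api(result: str, failures: int) -> Callable[[], Observation]:
    """Simulated API that fails `failures` times before returning `result`."""
    state = {"left": failures}
    def call() -> Observation:
        if state["left"] > 0:
            state["left"] -= 1
            return ("timeout", False)
        return (result, True)
    return call

# Hypothetical Global Action Repository: action name -> environment + callable.
GAR: Dict[str, Action] = {
    "ask_order_id": Action("user", lambda: ("order-123", True)),
    "check_order_status": Action("api", make_flaky_api("shipped", failures=1)),
    "share_tracking": Action("user", lambda: ("acknowledged", True)),
    "inform_delay": Action("user", lambda: ("acknowledged", True)),
}

# Hypothetical SOP as a DAG: step -> {observation: next step}; "*" is a default edge.
SOP: Dict[str, Dict[str, str]] = {
    "ask_order_id": {"*": "check_order_status"},
    "check_order_status": {"shipped": "share_tracking", "*": "inform_delay"},
    "share_tracking": {},  # terminal step
    "inform_delay": {},    # terminal step
}

def decide_next(sop, memory: ExecutionMemory, start: str,
                max_retries: int = 2) -> Optional[str]:
    """Rule-based stand-in for the state-decision LLM (an assumption)."""
    if not memory.entries:
        return start
    last = memory.entries[-1]
    if not last["ok"]:
        # Count consecutive failures of the same action, then retry or give up.
        attempts = 0
        for entry in reversed(memory.entries):
            if entry["action"] == last["action"] and not entry["ok"]:
                attempts += 1
            else:
                break
        return last["action"] if attempts <= max_retries else None
    edges = sop[last["action"]]
    return edges.get(last["obs"], edges.get("*"))  # None at a terminal step

def run_agent(sop, gar, start: str, max_steps: int = 20) -> ExecutionMemory:
    memory = ExecutionMemory()
    for _ in range(max_steps):
        action = decide_next(sop, memory, start)
        if action is None:
            break
        obs, ok = gar[action].run()   # act in the user/API environment
        memory.add(action, obs, ok)   # observation feeds the next decision
    return memory

memory = run_agent(SOP, GAR, "ask_order_id")
trace = [(e["action"], e["ok"]) for e in memory.entries]
print(trace)
```

In this sketch the simulated status API times out once, so the trace records a failed `check_order_status` followed by a successful retry before the flow reaches `share_tracking`. Keeping failures in memory, rather than raising, is what lets the decision step choose between repeating an action and stopping.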
