A Practical Approach for Building Production-Grade Conversational Agents with Workflow Graphs
Chiwan Park, Wonjun Jang, Daeryong Kim, Aelim Ahn, Kichang Yang, Woosung Hwang, Jihyeon Roh, Hyerin Park, Hyosun Wang, Min Seok Kim, Jihoon Kang
TL;DR
The paper tackles the challenge of deploying production-grade conversational agents by reconciling flexible LLM behavior with strict domain constraints through a graph-based workflow (DAG) framework. It introduces per-node prompts, constrained decoding, and history manipulation, complemented by a data-collection pipeline using a prototype agent and loss-masked fine-tuning to preserve node-specific guidance. The authors demonstrate a real-world e-commerce application, achieving a 52% improvement in task accuracy and a 50% improvement in format adherence over baselines and GPT-4o, with the internal model even surpassing GPT-4o in several metrics. The framework offers a scalable, controllable approach for building reliable AI agents suitable for mobile messaging and other constraint-heavy domains. The work bridges research and practice, enabling production-ready AI assistants with robust tooling, evaluation, and governance considerations.
Abstract
The advancement of Large Language Models (LLMs) has led to significant improvements in various service domains, including search, recommendation, and chatbot applications. However, applying state-of-the-art (SOTA) research to industrial settings presents challenges, as it requires maintaining flexible conversational abilities while also strictly complying with service-specific constraints. This can be seen as two conflicting requirements due to the probabilistic nature of LLMs. In this paper, we propose our approach to addressing this challenge and detail the strategies we employed to overcome their inherent limitations in real-world applications. We conduct a practical case study of a conversational agent designed for the e-commerce domain, detailing our implementation workflow and optimizations. Our findings provide insights into bridging the gap between academic research and real-world application, introducing a framework for developing scalable, controllable, and reliable AI-driven agents.
