Conversation Routines: A Prompt Engineering Framework for Task-Oriented Dialog Systems
Giorgio Robino
TL;DR
This paper introduces Conversation Routines (CR), a prompt-based framework for building task-oriented dialog systems with LLMs: business logic is embedded directly in the prompt, where it guides the control flow of the conversation. CR lets domain experts design complex conversational workflows in natural language while developers implement the supporting tools. The paper demonstrates CR with two proof-of-concept use cases, a Train Ticket Booking System and an Interactive Troubleshooting Copilot, showing how LLMs with function calling and optional multi-agent orchestration can maintain context, handle iterative and conditional procedures, and integrate with backend systems. It situates CR relative to SWARM and related frameworks, discusses benefits and limitations (notably non-deterministic behavior and computational overhead), and outlines future work on evaluation methods, prompt optimization, resource efficiency, and balancing structured logic with conversational design to improve robustness and scalability in real-world applications.
Abstract
This study introduces Conversation Routines (CR), a structured prompt engineering framework for developing task-oriented dialog systems using Large Language Models (LLMs). While LLMs demonstrate remarkable natural language understanding capabilities, engineering them to reliably execute complex business workflows remains challenging. The proposed CR framework enables the development of Conversation Agentic Systems (CAS) through natural language specifications, embedding task-oriented logic within LLM prompts. This approach provides a systematic methodology for designing and implementing complex conversational workflows while maintaining behavioral consistency. We demonstrate the framework's effectiveness through two proof-of-concept implementations: a Train Ticket Booking System and an Interactive Troubleshooting Copilot. These case studies validate CR's capability to encode sophisticated behavioral patterns and decision logic while preserving natural conversational flexibility. Results show that CR enables domain experts to design conversational workflows in natural language while leveraging custom functions (tools) developed by software engineers, creating an efficient division of responsibilities where developers focus on core API implementation and domain experts handle conversation design. While the framework shows promise in accessibility and adaptability, we identify key challenges including computational overhead, non-deterministic behavior, and domain-specific logic optimization. Future research directions include CR evaluation methods based on prompt engineering frameworks driven by goal-oriented grading criteria, improving scalability for complex multi-agent interactions, and enhancing system robustness to address the identified limitations across diverse business applications.
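To make the division of responsibilities concrete, the following is a minimal sketch and not the paper's implementation. It assumes a routine written in plain language by a domain expert (the hypothetical `ROUTINE_PROMPT`), two stub tools written by a developer (`search_trains` and `book_ticket`, both invented here for illustration), and a tool-call shape consisting of a function name plus JSON-encoded arguments, as is common in function-calling APIs. The LLM call itself is omitted; only the developer-side dispatch is shown.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical routine authored by a domain expert in natural language.
# In a CR-style system this text would be placed in the system prompt.
ROUTINE_PROMPT = """\
You are a train ticket booking assistant. Follow this routine:
1. Ask for the departure station, arrival station, and travel date.
2. Call search_trains with the collected values.
3. Present the options and ask the user to pick one.
4. Ask for explicit confirmation, then call book_ticket.
If the user changes any detail, go back to step 2.
"""


def search_trains(origin: str, destination: str, date: str) -> list:
    """Stub backend (assumption): returns a fixed timetable entry."""
    return [{"train_id": "T100", "origin": origin,
             "destination": destination, "date": date, "departure": "09:00"}]


def book_ticket(train_id: str, passenger: str) -> dict:
    """Stub backend (assumption): always confirms the booking."""
    return {"status": "confirmed", "train_id": train_id, "passenger": passenger}


# Registry of developer-provided tools that the model is allowed to call.
TOOLS: Dict[str, Callable[..., Any]] = {
    "search_trains": search_trains,
    "book_ticket": book_ticket,
}


def dispatch_tool_call(call: Dict[str, str]) -> str:
    """Run one tool call requested by the model and return a JSON string.

    `call` is assumed to look like {"name": ..., "arguments": "<JSON>"}.
    """
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(call["arguments"])
    return json.dumps(TOOLS[name](**args))


if __name__ == "__main__":
    # Simulated model output for step 2 of the routine.
    simulated_call = {
        "name": "search_trains",
        "arguments": json.dumps(
            {"origin": "Genova", "destination": "Milano", "date": "2025-03-01"}
        ),
    }
    print(dispatch_tool_call(simulated_call))
```

In this sketch the control flow (ordering, confirmation, the loop back to step 2) lives entirely in the prompt text, while the Python code only exposes and executes tools, which mirrors the split between conversation design and API implementation described in the abstract.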
