Conversation Routines: A Prompt Engineering Framework for Task-Oriented Dialog Systems

Giorgio Robino

TL;DR

This paper introduces Conversation Routines (CR), a prompt-based framework for building task-oriented dialog systems with LLMs: business logic is embedded directly in the prompt, where it guides the control flow of the conversation. It argues that CR lets domain experts design complex conversational workflows in natural language while developers implement the supporting tools. The author validates CR with two proof-of-concept use cases, a Train Ticket Booking System and an Interactive Troubleshooting Copilot, showing how LLMs with function calling and optional multi-agent orchestration can maintain context, handle iterative and conditional procedures, and integrate with backend systems. The paper situates CR relative to SWARM and related frameworks, discusses its benefits and its limitations (notably non-deterministic behavior and computational overhead), and outlines future work on evaluation methods, prompt optimization, resource efficiency, and balancing structured logic with conversational design to improve robustness and scalability in real-world applications.

Abstract

This study introduces Conversation Routines (CR), a structured prompt engineering framework for developing task-oriented dialog systems using Large Language Models (LLMs). While LLMs demonstrate remarkable natural language understanding capabilities, engineering them to reliably execute complex business workflows remains challenging. The proposed CR framework enables the development of Conversation Agentic Systems (CAS) through natural language specifications, embedding task-oriented logic within LLM prompts. This approach provides a systematic methodology for designing and implementing complex conversational workflows while maintaining behavioral consistency. We demonstrate the framework's effectiveness through two proof-of-concept implementations: a Train Ticket Booking System and an Interactive Troubleshooting Copilot. These case studies validate CR's capability to encode sophisticated behavioral patterns and decision logic while preserving natural conversational flexibility. Results show that CR enables domain experts to design conversational workflows in natural language while leveraging custom functions (tools) developed by software engineers, creating an efficient division of responsibilities where developers focus on core API implementation and domain experts handle conversation design. While the framework shows promise in accessibility and adaptability, we identify key challenges including computational overhead, non-deterministic behavior, and domain-specific logic optimization. Future research directions include CR evaluation methods based on prompt engineering frameworks driven by goal-oriented grading criteria, improving scalability for complex multi-agent interactions, and enhancing system robustness to address the identified limitations across diverse business applications.
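
The division of responsibilities described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the routine text, the in-memory station list, and the `dispatch_tool_call` helper are hypothetical, and the model's tool request is simulated as a JSON string rather than produced by an LLM. Only the function name `search_railway_station` comes from the paper's figure captions. The routine stands for what a domain expert would write in natural language; the function and dispatcher stand for what a developer would implement.

```python
import json
from typing import Callable, Dict, List

# Hypothetical Conversation Routine: workflow logic written in natural language.
# The wording is illustrative and is not taken from the paper.
CONVERSATION_ROUTINE = """\
You are a train ticket booking assistant.
1. Ask the user for the departure and arrival stations.
2. Call search_railway_station(query) to resolve each station name.
3. If several stations match, ask the user to choose one before continuing.
4. Confirm the journey details with the user before booking.
"""

# Hypothetical in-memory station list standing in for a real backend.
_STATIONS = ["Milano Centrale", "Milano Porta Garibaldi", "Genova Brignole"]


def search_railway_station(query: str) -> List[str]:
    """Return station names containing the query (case-insensitive)."""
    q = query.lower()
    return [name for name in _STATIONS if q in name.lower()]


# Registry mapping tool names the model may request to Python callables.
TOOLS: Dict[str, Callable[..., object]] = {
    "search_railway_station": search_railway_station,
}


def dispatch_tool_call(call_json: str) -> str:
    """Run one tool call given as JSON {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    func = TOOLS.get(call["name"])
    if func is None:
        return json.dumps({"error": "unknown tool: " + call["name"]})
    return json.dumps({"result": func(**call["arguments"])})


if __name__ == "__main__":
    # Simulated model output; a real system would receive this from the LLM.
    request = '{"name": "search_railway_station", "arguments": {"query": "milano"}}'
    print(dispatch_tool_call(request))
```

In a real system the routine would be sent as the system prompt, and the JSON result would be appended to the conversation history so the model can apply step 3 when more than one station matches.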

Paper Structure (39 sections, 7 figures)

Figures (7)

  • Figure 1: A Conversational Agentic System for train ticket booking, showing how the LLM with control logic interacts with external functions.
  • Figure 2: A Single-Agent Conversational Agentic System Architecture. The figure illustrates the interaction between key components: the Conversation Routine, contextual data, and conversation history, all encapsulated within the LLM's context window. Inputs from domain experts and conversation designers define the agent's instructions and functions, while developers implement API wrappers and external system calls. The system dynamically processes user inputs through the LLM's chat completion capabilities, leveraging modular function calls and seamlessly returning results.
  • Figure 3: Collaboration Between Designers and Developers in a Conversation Agentic System.
  • Figure 4: Flowchart visualization of the train ticket booking workflow as specified in the LLM prompt. The diagram was generated by prompting an LLM to create a Mermaid flowchart representation of the prompt.
  • Figure 5: Token consumption increases linearly with context window accumulation, as shown in a 14-agent-turn dialog session using 36,208 tokens over 3 minutes and 41 seconds. Functions were correctly called, with search_railway_station() invoked 4 times to resolve source and destination ambiguities. The token count grows from an initial 2,013 to 3,644 as conversation history and function responses are progressively added to the context window.
  • ...and 2 more figures
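
The token figures in the Figure 5 caption can be related to each other with a rough calculation. The sketch below assumes, hypothetically, that the context sent on each of the 14 turns grows in equal steps from 2,013 to 3,644 tokens and that the session total is the sum of those per-turn sizes. The paper does not state this breakdown, so the result is an order-of-magnitude check rather than a reproduction of the reported 36,208 tokens.

```python
def estimate_total_tokens(first: int, last: int, turns: int) -> float:
    """Sum per-turn context sizes, assuming linear growth from first to last."""
    if turns < 2:
        return float(first * turns)
    step = (last - first) / (turns - 1)  # tokens added per turn (assumed constant)
    return sum(first + step * i for i in range(turns))


if __name__ == "__main__":
    total = estimate_total_tokens(first=2013, last=3644, turns=14)
    print(round(total))  # 39599, the same order as the reported 36,208
```

Because each turn resends the accumulated history, a context that grows linearly per turn makes the session total grow roughly with the square of the number of turns, which is the overhead the caption points to.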