A Roadmap for Tamed Interactions with Large Language Models

Vincenzo Scotti, Jan Keim, Tobias Hey, Andreas Metzger, Anne Koziolek, Raffaela Mirandola

TL;DR

This paper argues that the unreliability of large language models (LLMs) limits their use in automated pipelines and proposes a domain-specific language, the LLM Scripting Language (LSL), to script and constrain LLM interactions independently of model training. By integrating templating, output constraints, non-linear workflow support, and formal verification and validation, LSL aims to improve reliability, robustness, and trustworthiness in AI-powered software. The authors outline a roadmap covering foundational language and API design, benchmarks, interpreters, and human-centric and ethical considerations, and they identify standardisation and ecosystem development as essential for scalable adoption. The approach seeks to treat prompts, data, and code as first-class citizens within software engineering practice, enabling verifiable, explainable, and controllable LLM-powered applications with a DevOps-ready lifecycle.

Abstract

We are witnessing a bloom of AI-powered software driven by Large Language Models (LLMs). Although the applications of these LLMs are impressive and seemingly countless, their unreliability hinders adoption. In fact, the tendency of LLMs to produce faulty or hallucinated content makes them unsuitable for automating workflows and pipelines. In this regard, Software Engineering (SE) provides valuable support, offering a wide range of formal tools to specify, verify, and validate software behaviour. Such SE tools can be applied to define constraints over LLM outputs and, consequently, offer stronger guarantees on the generated content. In this paper, we argue that the development of a Domain Specific Language (DSL) for scripting interactions with LLMs, namely an LLM Scripting Language (LSL), may be key to improving AI-based applications. Currently, LLMs and LLM-based software still lack reliability, robustness, and trustworthiness, and the tools and frameworks meant to cope with these issues suffer from fragmentation. We present our vision of LSL, with which we aim to address these limitations by exploring ways to control LLM outputs, enforce structure in interactions, and integrate these aspects with verification, validation, and explainability. Our goal is to make LLM interaction programmable and decoupled from training or implementation.
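To make the idea of constraining LLM outputs concrete, the following is a minimal Python sketch, not LSL syntax and not the authors' implementation. It assumes a prompt template, a structural constraint expressed as required keys and types, and a retry loop that rejects any output violating the constraint. All names here (`PROMPT`, `check_output`, `constrained_call`, `make_flaky_llm`) are hypothetical, and the model is replaced by a stub so the example runs without any external service.

```python
import json
from string import Template
from typing import Callable

# Hypothetical prompt template (illustrative only, not LSL syntax).
PROMPT = Template("Return a JSON object with keys $keys.\nText: $text")


def check_output(raw: str, required: dict[str, type]) -> dict:
    """Parse raw model output and enforce a simple structural constraint."""
    data = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"wrong type for key: {key}")
    return data


def constrained_call(
    llm: Callable[[str], str],
    text: str,
    required: dict[str, type],
    max_attempts: int = 3,
) -> dict:
    """Query the model until its output satisfies the constraint or attempts run out."""
    prompt = PROMPT.substitute(keys=", ".join(required), text=text)
    last_error = None
    for _ in range(max_attempts):
        try:
            return check_output(llm(prompt), required)
        except ValueError as err:
            last_error = err  # reject the output and try again
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


def make_flaky_llm() -> Callable[[str], str]:
    """Stub standing in for a real model: two invalid replies, then a valid one."""
    replies = iter(['not json', '{"name": "Ada"}', '{"name": "Ada", "age": 36}'])
    return lambda prompt: next(replies)


if __name__ == "__main__":
    result = constrained_call(make_flaky_llm(), "Ada is 36.", {"name": str, "age": int})
    print(result)
```

In this sketch the constraint is checked after generation and enforced by rejection and retry. That gives the caller a guarantee about the structure of whatever is returned, but none about how many attempts are needed, which is one reason a dedicated language and tooling could be preferable to ad hoc wrappers like this one.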

Paper Structure

This paper contains 31 sections, 5 figures.

Figures (5)

  • Figure 1: LLM prompting approaches.
  • Figure 2: Idea behind LSL: the domain expert uses the DSL to write scripts for different tasks; end users accessing their AI-powered applications, or servers running those applications, interact with the LLM by selecting the script corresponding to the required functionality.
  • Figure 3: Example semi-structured use case (system messages, instructions, data, and subroutine outputs): knowledge-grounded chat.
  • Figure 4: Example structured use case (system messages, instructions, data, and subroutine outputs): database creation.
  • Figure 5: Roadmap to the development of LSL.