A Roadmap for Tamed Interactions with Large Language Models
Vincenzo Scotti, Jan Keim, Tobias Hey, Andreas Metzger, Anne Koziolek, Raffaela Mirandola
TL;DR
This paper argues that the unreliability of large language models (LLMs) limits their use in automated pipelines and proposes a domain-specific language, the LLM Scripting Language (LSL), to script and constrain LLM interactions independently of model training. By integrating templating, output constraints, support for non-linear workflows, and formal verification and validation, LSL aims to improve the reliability, robustness, and trustworthiness of AI-powered software. The authors outline a roadmap covering foundational language and API design, benchmarks, interpreters, and human-centric and ethical considerations, and they identify standardisation and ecosystem development as essential for scalable adoption. The approach seeks to treat prompts, data, and code as first-class citizens within software engineering practice, enabling verifiable, explainable, and controllable LLM-powered applications with a DevOps-ready lifecycle.
Abstract
We are witnessing a bloom of AI-powered software driven by Large Language Models (LLMs). Although the applications of these LLMs are impressive and seemingly countless, their unreliability hinders adoption. In fact, the tendency of LLMs to produce faulty or hallucinated content makes them unsuitable for automating workflows and pipelines. In this regard, Software Engineering (SE) provides valuable support, offering a wide range of formal tools to specify, verify, and validate software behaviour. Such SE tools can be applied to define constraints over LLM outputs and, consequently, offer stronger guarantees on the generated content. In this paper, we argue that developing a Domain-Specific Language (DSL) for scripting interactions with LLMs, which we call the LLM Scripting Language (LSL), may be key to improving AI-based applications. Currently, LLMs and LLM-based software still lack reliability, robustness, and trustworthiness, and the tools and frameworks meant to cope with these issues are fragmented. We present our vision of LSL, with which we aim to address these limitations by exploring ways to control LLM outputs, enforce structure in interactions, and integrate these aspects with verification, validation, and explainability. Our goal is to make LLM interaction programmable and decoupled from training or implementation.
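To make the idea of constraining LLM outputs concrete, the following is a minimal Python sketch, not LSL syntax and not the authors' implementation. It combines a prompt template with an explicit output constraint and retries when the constraint is violated. The names `PROMPT`, `ALLOWED_LABELS`, `check_output`, `constrained_call`, and `make_flaky_model` are all hypothetical, and the model is a stub so that the example runs without any LLM backend.

```python
import json
from string import Template
from typing import Callable

# Hypothetical prompt template (illustration only, not LSL syntax).
PROMPT = Template(
    'Classify the sentiment of: "$text". '
    'Reply as JSON with keys "label" and "confidence".'
)

# Assumed output constraint: the label must come from this closed set.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def check_output(raw: str) -> dict:
    """Parse a model reply and enforce the constraints; raise ValueError on violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError("label outside the allowed set")
    conf = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a number in [0, 1]")
    return data


def constrained_call(model: Callable[[str], str], text: str, max_attempts: int = 3) -> dict:
    """Render the template, query the model, and retry until the output is valid."""
    prompt = PROMPT.substitute(text=text)
    last_error = None
    for _ in range(max_attempts):
        try:
            return check_output(model(prompt))
        except ValueError as exc:
            last_error = exc  # constraint violated: try again
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")


def make_flaky_model() -> Callable[[str], str]:
    """Stub standing in for an LLM: first reply is free text, second is valid JSON."""
    replies = iter(["Sure! It is positive.", '{"label": "positive", "confidence": 0.9}'])
    return lambda prompt: next(replies)


if __name__ == "__main__":
    print(constrained_call(make_flaky_model(), "I love it"))
```

The sketch only checks outputs after generation. A scripting language such as the one envisioned here could in principle express the same constraint declaratively and hand it to a verifier, but that design is left open by the abstract.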
