Towards Machine-Generated Code for the Resolution of User Intentions
Justus Flerlage, Ilja Behnke, Odej Kao
TL;DR
The paper investigates whether natural-language user intentions can be translated into executable workflows by prompting an LLM to generate code against a simplified API for a GUI-less operating system and then executing that code. It proposes a system architecture consisting of an LLM Service, a Controller, a Function Table of callable OS interfaces, and an Executor that together realize intentions as code, and validates it with GPT-4o-mini and a constrained Python execution environment. The study demonstrates general feasibility: most tested intents were resolved through generated code. It also analyzes timing metrics and failure modes related to sandboxing and scoping. Finally, it highlights security considerations and outlines directions for optimization and potential on-device deployment, emphasizing the shift toward hybrid, intent-driven human–AI collaboration.
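To make the four components concrete, the following is a minimal sketch of how a Controller might connect an LLM Service, a Function Table, and an Executor. It is an illustration under stated assumptions, not the authors' implementation: every name in it (`make_function_table`, `send_message`, `get_contacts`, `stub_llm_service`, `execute`, `resolve_intent`) is hypothetical, and the LLM Service is replaced by a stub that returns fixed code, so no model is queried.

```python
from typing import Callable, Dict, List


def make_function_table(log: List[tuple]) -> Dict[str, Callable]:
    """Hypothetical Function Table: the only names generated code may call."""

    def get_contacts() -> List[str]:
        # Stand-in for an OS interface; returns fixed demo data.
        return ["alice", "bob"]

    def send_message(recipient: str, text: str) -> bool:
        # Record the call instead of touching a real messaging service.
        log.append(("send_message", recipient, text))
        return True

    return {"get_contacts": get_contacts, "send_message": send_message}


def stub_llm_service(intent: str, function_names: List[str]) -> str:
    """Stand-in for the LLM Service (assumption: a real one would prompt a
    model with the intent and the function names). Returns fixed code."""
    return (
        "for contact in get_contacts():\n"
        "    send_message(contact, 'Meeting at 10')\n"
    )


def execute(code: str, function_table: Dict[str, Callable]) -> None:
    """Executor: run the generated code with only the Function Table in scope.

    Emptying __builtins__ narrows what the code can name, but it is NOT a
    security boundary; a real deployment needs proper isolation.
    """
    namespace = {"__builtins__": {}, **function_table}
    exec(code, namespace)


def resolve_intent(
    intent: str,
    llm_service: Callable[[str, List[str]], str],
    function_table: Dict[str, Callable],
) -> str:
    """Controller: request code for the intent, then hand it to the Executor."""
    code = llm_service(intent, sorted(function_table))
    execute(code, function_table)
    return code


if __name__ == "__main__":
    calls: List[tuple] = []
    table = make_function_table(calls)
    resolve_intent("Tell all my contacts the meeting is at 10",
                   stub_llm_service, table)
    print(calls)
```

The sketch passes a single namespace dictionary to `exec`. This is a deliberate choice: when `exec` receives separate globals and locals mappings, functions defined inside the generated code cannot see names stored in the locals mapping, which is a general property of Python rather than a finding taken from the paper.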
Abstract
The growing capabilities of Artificial Intelligence (AI), particularly Large Language Models (LLMs), prompt a reassessment of the interaction mechanisms between users and their devices. Currently, users must use a set of high-level applications to achieve their desired results. The advent of AI may signal a shift in this regard, as its capabilities open new prospects for resolving user-provided intents by deploying model-generated code. This development represents a significant step for hybrid workflows, in which human and artificial intelligence collaborate to address user intentions: the former defines the intentions and the latter implements the solutions that address them. In this paper, we investigate the feasibility of generating and executing workflows through code generation, where the code results from prompting an LLM with a concrete user intention and a simplified application programming interface for a GUI-less operating system. We provide an in-depth analysis and comparison of various user intentions, the resulting code, and its execution. The findings demonstrate the general feasibility of our approach and show that the employed LLM, GPT-4o-mini, exhibits remarkable proficiency in generating code-oriented workflows in accordance with the provided user intentions.
