Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts

Jenny T. Liang, Melissa Lin, Nikitha Rao, Brad A. Myers

TL;DR

This paper investigates prompt programming: the idea that prompts can function as programs within software powered by foundation models (FMs). Using Straussian grounded theory, the authors interview 20 prompt programmers across a variety of contexts and derive 15 observations about how prompts are created, tested, and evolved in practice. The study finds that developers build mental models of FM behavior through interaction and experimentation, yet these models are often unreliable due to FM opacity and nondeterminism, making debugging and validation challenging. It highlights fundamental differences from traditional software development, notably the centrality of data curation, the dual role of prompts as specifications and implementations, and the fragility of prompts across FM updates and models. The findings suggest a need for dedicated tools, workflows, and education to support prompt programming as a distinct engineering activity at the intersection of software engineering and prompt engineering, with implications for practitioners, researchers, and tool-makers.

Abstract

Generative pre-trained models power intelligent software features that are used by millions of users and controlled by developer-written natural language prompts. Despite the impact of prompt-powered software, little is known about its development process and its relationship to programming. In this work, we argue that some prompts are programs and that the development of prompts is a distinct phenomenon in programming known as "prompt programming". We develop an understanding of prompt programming using Straussian grounded theory through interviews with 20 developers engaged in prompt development across a variety of contexts, models, domains, and prompt structures. We contribute 15 observations to form a preliminary understanding of current prompt programming practices. For example, rather than building mental models of code, prompt programmers develop mental models of the foundation model's (FM's) behavior on the prompt by interacting with the FM. While prior research shows that experts have well-formed mental models, we find that prompt programmers who have developed dozens of prompts still struggle to develop reliable mental models. Our observations show that prompt programming differs from traditional software development, motivating the creation of prompt programming tools and providing implications for software engineering stakeholders.

Paper Structure (57 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: An overview of the 15 study findings on the barriers in prompt programming. These can be divided into four types of barriers: understanding FM behavior, dealing with stochasticity, programming in natural language, and testing prompt program behavior.
  • Figure 2: An overview of the Straussian grounded theory methodology performed in the study.
  • Figure 3: A subset of the interview questions. The full protocol is in the supplemental materials.
  • Figure 4: A programmer uses their prior experience with the foundation model (FM) to construct a mental model of how the FM may perform on the task. The programmer uses this mental model to write the prompt. After observing how the model responds, the developer uses this new information to update their mental model, which influences the next iteration of the prompt.
  • Figure 5: Recommendations for prompt programming for educators and practitioners.