What is it like to program with artificial intelligence?

Advait Sarkar, Andrew D. Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn

TL;DR

Large language models enable automatic code generation and editor-integrated tooling that transform programming practice. The paper argues that LLM-assisted programming is a distinct modality, not fully captured by existing metaphors such as search, compilation, or pair programming, and synthesizes evidence from usability studies, experience reports, and end-user deployments. It identifies benefits like rapid boilerplate generation and flexible problem framing, while highlighting challenges in prompt engineering, verification, readability, safety, and governance, especially for non-expert end users. The work emphasizes the need for new design paradigms, evaluation methods, and governance approaches to harness LLMs responsibly and effectively across professional and end-user programming contexts.

Abstract

Large language models, such as OpenAI's Codex and DeepMind's AlphaCode, can generate code to solve a variety of problems expressed in natural language. This technology has already been commercialised in at least one widely-used programming editor extension: GitHub Copilot. In this paper, we explore how programming with large language models (LLM-assisted programming) is similar to, and differs from, prior conceptualisations of programmer assistance. We draw upon publicly available experience reports of LLM-assisted programming, as well as prior usability and design studies. We find that while LLM-assisted programming shares some properties of compilation, pair programming, and programming via search and reuse, there are fundamental differences both in the technical possibilities and in the practical experience. Thus, LLM-assisted programming ought to be viewed as a new way of programming with its own distinct properties and challenges. Finally, we draw upon observations from a user study in which non-expert end-user programmers use LLM-assisted tools for solving data tasks in spreadsheets. We discuss the issues that might arise, and open research challenges, in applying large language models to end-user programming, particularly with users who have little or no programming expertise.

Paper Structure (27 sections, 6 figures)

This paper contains 27 sections and 6 figures.

Figures (6)

  • Figure 1: Code generation using the GitHub Copilot editor extension. The portion highlighted in blue has been generated by the model. Left: a function body, generated based on a textual description in a comment. Right: a set of generated test cases. Source: copilot.github.com
  • Figure 2: Code generation with GitHub Copilot. The portion highlighted in blue has been generated by the model. Above: a pattern, extrapolated based on two examples. Below: a function body, generated from the signature and the first line. Source: copilot.github.com
  • Figure 3: Code generation using the Tabnine editor extension. The grey text after the cursor is being suggested by the model based on the comment on the preceding line. Source: tabnine.com
  • Figure 4: API suggestion using the Visual Studio IntelliCode feature. Source: silver_2018
  • Figure 5: Searching for code snippets using Bing Developer Assistant. A result for Stack Overflow is shown. Note how the query "generate md5 hash from string @line" contains a hint about the identifier line, which is used to rewrite the retrieved snippet. Source: https://www.microsoft.com/en-us/research/publication/building-bing-developer-assistant/
  • ...and 1 more figure