Understanding the Human-LLM Dynamic: A Literature Survey of LLM Use in Programming Tasks

Deborah Etsenake, Meiyappan Nagappan

TL;DR

This literature survey analyzes how programmers interact with transformer-based LLMs in coding tasks, synthesizing findings from 88 post-2017 studies to map interaction patterns, human enhancement, and task performance. It identifies three core themes in interaction data and proposes a standardized set of metrics and prompting strategies to manage LLM non-determinism and improve usability. The paper finds that LLMs can boost time productivity and learning, but effects on task performance are mixed and highly task-dependent, underscoring the need for cross-model validation. It outlines concrete directions for future work, including quantitative studies of interaction patterns and broader validation across diverse LLMs to strengthen guidance for researchers and practitioners.

Abstract

Large Language Models (LLMs) are transforming programming practices, offering significant capabilities for code generation activities. While researchers have explored the potential of LLMs in various domains, this paper focuses on their use in programming tasks, drawing insights from user studies that assess the impact of LLMs on those tasks. We first examine the user interaction behaviors with LLMs observed in these studies, from the types of requests made to task completion strategies. Our analysis then reveals both benefits and weaknesses of LLMs, showing mixed effects on the human and the task. Lastly, we look into which factors, from the human, the LLM, or the interaction of both, affect the human's enhancement as well as the task performance. Our findings highlight the variability in human-LLM interactions due to the non-deterministic nature of both parties (humans and LLMs), underscoring the need for a deeper understanding of these interaction patterns. We conclude by providing some practical suggestions for researchers as well as programmers.

Paper Structure (52 sections, 2 figures, 8 tables)

This paper contains 52 sections, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Human enhancement themes categorized by the number of papers reporting positive, neutral, and negative effects.
  • Figure 2: LLM response evaluation metrics examined in the papers, grouped by the number of papers reporting positive, negative, and neutral/average effects.