Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice

Ranim Khojah; Mazen Mohamad; Philipp Leitner; Francisco Gomes de Oliveira Neto

TL;DR

This paper investigates how software engineers in industry actually use ChatGPT. It finds that practitioners more often seek guidance and learning than ready-to-use production code. It presents a theoretical framework that connects the purpose of the interaction, internal factors (e.g., prompts and the user's personality), and external factors (e.g., policies and data sources) to perceived usefulness and trust. The study observes 24 professionals over five days, combines chat logs with an exit survey, and classifies dialogues into three purposes: Artifact Manipulation, Expert Consultation, and Training. The findings have practical implications for enterprise AI deployment, prompt design, and future empirical research on LLM-assisted software engineering.

Abstract

Large Language Models (LLMs) are frequently discussed in academia and the general public as support tools for virtually any use case that relies on the production of text, including software engineering. Currently, there is much debate, but little empirical evidence, regarding the practical usefulness of LLM-based tools such as ChatGPT for engineers in industry. We conduct an observational study of 24 professional software engineers who used ChatGPT in their jobs over a period of one week, and qualitatively analyse their dialogues with the chatbot as well as their overall experience (as captured by an exit survey). We find that, rather than expecting ChatGPT to generate ready-to-use software artifacts (e.g., code), practitioners more often use ChatGPT to receive guidance on how to solve their tasks or to learn about a topic in more abstract terms. We also propose a theoretical framework for how (i) the purpose of the interaction, (ii) internal factors (e.g., the user's personality), and (iii) external factors (e.g., company policy) together shape the experience (in terms of perceived usefulness and trust). We envision that our framework can be used by future research to further the academic discussion on LLM usage by software engineering practitioners, and to serve as a reference point for the design of future empirical LLM research in this domain.

Paper Structure (19 sections, 5 figures, 3 tables)

Introduction
Related work
Methodology
Participants and data collection
Data analysis
Findings
Purpose
Artifact Manipulation
Expert Consultation
Training
Internal Factors
Prompts
Personality and expectations
External Factors
Personal Experience
...and 4 more sections

Figures (5)

Figure 1: The main steps followed in our observational study.
Figure 2: Decision tree to guide dialogue classification. The tree first asks whether the dialogue involves a practical problem. If so, it asks whether the user's goal is to be guided: if yes, the dialogue is classified as Expert Consultation; if not, the tree asks whether the user is looking for an executable solution, which leads to either Artifact Manipulation or Expert Consultation. If there is no practical problem, the tree asks whether understanding develops over the course of the dialogue, which leads to either Training or Expert Consultation.
Figure 3: A theoretical framework of the factors that influence the personal experience of interactions with ChatGPT in industrial software engineering.
Figure 4: Taxonomy of purposes for the usage of ChatGPT in software engineering.
Figure 5: Plots showing how the 23 participants rated ChatGPT's usefulness (left) and their trust in its answers (right).
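
The decision tree described in the Figure 2 caption can be sketched as a small function. This is an illustrative reading of the caption, not the authors' tooling. The names `Purpose`, `DialogueTraits`, and `classify_dialogue` are hypothetical, and the four boolean inputs stand in for judgments that a human coder would make while reading a dialogue. The caption does not spell out which answer leads to which leaf for the last two questions; the sketch assumes that wanting an executable solution leads to Artifact Manipulation and that a developing understanding leads to Training, with Expert Consultation as the other outcome in both cases.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    """The three dialogue purposes named in the paper's taxonomy."""
    ARTIFACT_MANIPULATION = "Artifact Manipulation"
    EXPERT_CONSULTATION = "Expert Consultation"
    TRAINING = "Training"


@dataclass(frozen=True)
class DialogueTraits:
    """Hypothetical container for a coder's yes/no judgments about one dialogue."""
    has_practical_problem: bool
    goal_is_guidance: bool = False
    wants_executable_solution: bool = False
    develops_understanding: bool = False


def classify_dialogue(traits: DialogueTraits) -> Purpose:
    """Walk the decision tree as read from the Figure 2 caption."""
    if traits.has_practical_problem:
        # Practical problem, and the user wants to be guided toward a solution.
        if traits.goal_is_guidance:
            return Purpose.EXPERT_CONSULTATION
        # Assumption: "yes" on executable solution means Artifact Manipulation.
        if traits.wants_executable_solution:
            return Purpose.ARTIFACT_MANIPULATION
        return Purpose.EXPERT_CONSULTATION
    # No practical problem.
    # Assumption: "yes" on developing understanding means Training.
    if traits.develops_understanding:
        return Purpose.TRAINING
    return Purpose.EXPERT_CONSULTATION


if __name__ == "__main__":
    example = DialogueTraits(has_practical_problem=True,
                             wants_executable_solution=True)
    print(classify_dialogue(example).value)
```

Under these assumptions, Expert Consultation acts as the fallback leaf on every branch, which fits the reported finding that guidance-seeking dialogues are common.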
