Examining the Use and Impact of an AI Code Assistant on Developer Productivity and Experience in the Enterprise

Justin D. Weisz, Shraddha Kumar, Michael Muller, Karen-Ellen Browne, Arielle Goldberg, Ellice Heintze, Shagun Bajpai

TL;DR

This paper analyzes the use and impact of an enterprise AI code assistant, watsonx Code Assistant (WCA), on developer productivity and experience at IBM. Drawing on a large-scale survey and unmoderated usability testing, it finds a small overall net productivity gain, with substantial variability across users and contexts. It identifies code understanding as the dominant use case, highlights the co-creative nature of human-AI work, and surfaces governance concerns around authorship and copyright in generated code. The findings inform design, policy, and education for deploying AI code assistants in large organizations, and they point to ways of improving adoption and trust as models mature.

Abstract

AI assistants are being created to help software engineers conduct a variety of coding-related tasks, such as writing, documenting, and testing code. We describe the use of the watsonx Code Assistant (WCA), an LLM-powered coding assistant deployed internally within IBM. Through surveys of two user cohorts (N=669) and unmoderated usability testing (N=15), we examined developers' experiences with WCA and its impact on their productivity. We learned about their motivations for using (or not using) WCA, we examined their expectations of its speed and quality, and we identified new considerations regarding ownership of and responsibility for generated code. Our case study characterizes the impact of an LLM-powered assistant on developers' perceptions of productivity, and it shows that although such tools often provide net productivity increases, these benefits may not be experienced by all users.

Paper Structure

This paper contains 41 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Survey respondent demographics: (a) years of experience as a professional software engineer, (b) tenure with IBM, and (c) geography.
  • Figure 2: Distributions of productivity and quality measures: (a) self-reported ratings of effort, quality of work, and speed on 7-point semantic differential scales (centered at 0), (b) distribution of self-efficacy scores, and (c) distribution of overall quality scores.
  • Figure 3: Distributions of purposes of use of WCA and views on code authorship.