An Empirical Study on Usage and Perceptions of LLMs in a Software Engineering Project

Sanka Rasnayaka, Guanlin Wang, Ridwan Shariffdeen, Ganesh Neelakanta Iyer

TL;DR

This study investigates the usefulness of large language models (LLMs) in an academic software engineering project by analyzing AI-generated code, prompts, and human intervention across a 214-student, team-based project to build a Static Program Analyzer (SPA) for the SIMPLE language. It combines artifact analysis with a UTAUT-based perception survey to assess adoption, usefulness, and influencing factors. Findings show that LLMs effectively bootstrap early development and help with debugging, that adoption depends on coding skill and prior AI experience, and that code quality remains comparable when AI-generated code is properly vetted by humans. The work offers educators a blueprint for integrating human-AI collaboration into software engineering curricula and highlights the need to teach prompt engineering and critical evaluation of AI-generated code.

Abstract

Large Language Models (LLMs) represent a leap in artificial intelligence, excelling in tasks using human language(s). Although the main focus of general-purpose LLMs is not code generation, they have shown promising results in the domain. However, the usefulness of LLMs in an academic software engineering project has not been fully explored yet. In this study, we explore the usefulness of LLMs for 214 students working in teams consisting of up to six members. Notably, in the academic course through which this study is conducted, students were encouraged to integrate LLMs into their development tool-chain, in contrast to most other academic courses that explicitly prohibit the use of LLMs. In this paper, we analyze the AI-generated code, prompts used for code generation, and the human intervention levels to integrate the code into the code base. We also conduct a perception study to gain insights into the perceived usefulness, influencing factors, and future outlook of LLMs from a computer science student's perspective. Our findings suggest that LLMs can play a crucial role in the early stages of software development, especially in generating foundational code structures and helping with syntax and error debugging. These insights provide us with a framework on how to effectively utilize LLMs as a tool to enhance the productivity of software engineering students, and highlight the necessity of shifting the educational focus toward preparing students for successful human-AI collaboration.

Paper Structure (13 sections, 6 figures, 6 tables)

Figures (6)

  • Figure 1: High-level software architecture of the Static Program Analyzer (SPA), the software to be developed by the students
  • Figure 2: The timeline of the software engineering project, highlighting important milestones along with specific tasks in each phase. The development activities for the project span from Week 1-13, with three key milestones that need to be achieved by the students.
  • Figure 3: Unified Theory of Acceptance and Use of Technology (UTAUT) model structure for the perception study
  • Figure 4: Self-Assessed Personal Traits
  • Figure 5: Levels of intervention performed on different generators' output
  • ...and 1 more figure