Leveraging Large Language Models for Predicting Cost and Duration in Software Engineering Projects
Justin Carpenter, Chia-Ying Wu, Nasir U. Eisty
TL;DR
This study investigates using LLMs, specifically GPT-3.5, to predict software project cost and duration and compares them against traditional estimation methods and standard ML approaches across multiple datasets, including ISBSG, Desharnais, and COCOMO. By transforming structured dataset features into natural language prompts and employing careful data quality controls (missing-value stratification), the authors demonstrate that LLMs can achieve competitive accuracy and exhibit robustness to incomplete data, though they do not consistently outperform state-of-the-art ML models. The work contributes a data-centric, prompt-engineering-driven workflow for LLM-based estimation, highlighting potential for easier integration into project management practices and offering practical insights into when LLMs add value versus traditional methods. Overall, the findings suggest LLMs can simplify the estimation process and improve usability while delivering meaningful gains in certain scenarios, with data quality and dataset selection playing pivotal roles in predictive success.
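The two data-side steps mentioned above, turning structured features into a natural language prompt and stratifying records by missing values, might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' actual pipeline: the function names, the prompt wording, the example column names, and the stratum thresholds are all hypothetical, and no model call is made.

```python
from typing import Any, Mapping


def is_missing(value: Any) -> bool:
    """Treat None, NaN, and blank strings as missing values."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is never equal to itself
        return True
    return isinstance(value, str) and value.strip() == ""


def row_to_prompt(row: Mapping[str, Any], target: str) -> str:
    """Render the known features of one project record as a plain-language prompt.

    The wording is a hypothetical template; the source does not give the exact prompt.
    """
    known = [
        f"{name.replace('_', ' ')}: {value}"
        for name, value in row.items()
        if not is_missing(value)
    ]
    description = "; ".join(known) if known else "no attributes recorded"
    return (
        f"A software project has the following attributes: {description}. "
        f"Estimate the {target} and reply with a single number."
    )


def missing_stratum(row: Mapping[str, Any]) -> str:
    """Assign a record to a stratum by its fraction of missing fields.

    The 25% and 50% cut-offs are assumed for illustration only.
    """
    if not row:
        return "complete"
    fraction = sum(is_missing(v) for v in row.values()) / len(row)
    if fraction == 0:
        return "complete"
    if fraction <= 0.25:
        return "low"
    if fraction <= 0.5:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Hypothetical record; real datasets such as ISBSG use their own column names.
    project = {"team_size": 5, "language": "Java", "function_points": None}
    print(missing_stratum(project))
    print(row_to_prompt(project, "effort in person-hours"))
```

Grouping records by stratum before evaluation makes it possible to report accuracy separately for complete and incomplete records, which is one way to examine the robustness to missing data described above.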
Abstract
Accurate estimation of project costs and durations remains a pivotal challenge in software engineering, directly impacting budgeting and resource management. Traditional estimation techniques, although widely used, often fall short because of their complexity and the dynamic nature of software development projects. This study introduces an approach that uses Large Language Models (LLMs) to enhance the accuracy and usability of project cost predictions. We compare LLMs against traditional methods and contemporary machine learning techniques, focusing on their potential to simplify the estimation process and provide higher accuracy. Our research is structured around three questions: whether LLMs can outperform existing models, including traditional estimation; how easily they can be integrated into current practices; and why traditional methods still prevail in industry settings. By applying LLMs to a range of real-world datasets and comparing their performance to both state-of-the-art and conventional methods, this study aims to demonstrate that LLMs not only yield more accurate estimates but also offer a user-friendly alternative to complex predictive models, potentially transforming project management strategies within the software industry.
