An Empirical Study on Challenges for LLM Application Developers

Xiang Chen; Chaoyang Gao; Chunyang Chen; Guangbei Zhang; Yong Liu

TL;DR

By mining 29,057 questions from the OpenAI developer forum and manually analyzing 2,364 sampled posts, the paper constructs a taxonomy of challenges faced by LLM application developers, spanning 6 categories and 26 subcategories. It finds that API usage and general questions dominate, that non-functional properties and advanced prompting techniques are recurring sources of difficulty, and that prompting-related practices are underutilized. The methodology also generalizes to GitHub issues for other LLMs, which shows both cross-platform relevance and platform-specific differences. The study provides actionable guidance for developers and vendors, including enhanced documentation, tooling for cost and rate management, and support for GPT Builder development.

Abstract

In recent years, large language models (LLMs) have seen rapid advancements, significantly impacting various fields such as computer vision, natural language processing, and software engineering. These LLMs, exemplified by OpenAI's ChatGPT, have revolutionized the way we approach language understanding and generation tasks. However, in contrast to traditional software development practices, LLM development introduces new challenges for AI developers in design, implementation, and deployment. These challenges span different areas (such as prompts, APIs, and plugins), requiring developers to navigate unique methodologies and considerations specific to LLM application development. Despite the profound influence of LLMs, to the best of our knowledge, these challenges have not been thoroughly investigated in previous empirical studies. To fill this gap, we present the first comprehensive study on understanding the challenges faced by LLM developers. Specifically, we crawl and analyze 29,057 relevant questions from a popular OpenAI developer forum. We first examine their popularity and difficulty. After manually analyzing 2,364 sampled questions, we construct a taxonomy of challenges faced by LLM developers. Based on this taxonomy, we summarize a set of findings and actionable implications for LLM-related stakeholders, including developers and providers (especially the OpenAI organization).

Paper Structure (23 sections, 19 figures)

Introduction
Background
Methodology
RQ1: Popularity Trend Analysis
RQ2: Difficulty Level Analysis
RQ3: Challenge Taxonomy Construction
General Questions
API
Generation and Understanding
Non-functional Properties
GPT Builder
Prompt
RQ4: Generalization of Our Methodology
Discussions
Findings and Implications of Our Empirical Study
...and 8 more sections

Figures (19)

Figure 1: Methodology overview of our empirical study.
Figure 2: A question post from the OpenAI developer forum.
Figure 3: The number of new posts and users every three months since the creation of the OpenAI developer forum. Note that the values on the $y$-axis are log-transformed for a clearer representation of the results.
Figure 4: The number of posts with different numbers of replies.
Figure 5: Proportion of posts with accepted solutions.
...and 14 more figures
