An Analysis of Automated Use Case Component Extraction from Scenarios using ChatGPT

Pragyan KC, Rocky Slavin, Sepideh Ghanavati, Travis Breaux, Mitra Bokaei Hosseini

TL;DR

Small mobile app teams struggle with evolving requirements and rely on limited elicitation methods. This work collects a corpus of 50 user-authored scenarios and evaluates ChatGPT for automated extraction of use case (UC) components, using seed prompts and an exploratory prompt-engineering study. Results show that domain knowledge is crucial for accurately extracting data practices. Lexical exact match ($EM$) performs poorly as an evaluation measure, whereas the $F_1$-score and cosine-based semantic similarity ($SM$) provide better signals; preprocessing further improves performance. Prompt refinements that embed domain knowledge improve UC component quality, suggesting a practical path toward low-cost, privacy-aware requirements extraction to support rapid, iterative mobile app development.
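
To illustrate why exact match can be a harsh signal while overlap-based and cosine-based measures are more forgiving, the three kinds of measures can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names are hypothetical, tokenization is plain whitespace splitting, and the cosine similarity is computed over bag-of-words term counts as a stand-in for whatever text representation the paper uses for $SM$.

```python
import math
from collections import Counter


def tokens(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()


def exact_match(pred, gold):
    # 1.0 only when the two strings are identical after trimming and lowercasing.
    return float(pred.strip().lower() == gold.strip().lower())


def token_f1(pred, gold):
    # Harmonic mean of token-level precision and recall (multiset overlap).
    p, g = Counter(tokens(pred)), Counter(tokens(gold))
    overlap = sum((p & g).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p.values())
    recall = overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(pred, gold):
    # Cosine of the angle between term-count vectors (hypothetical proxy for SM).
    p, g = Counter(tokens(pred)), Counter(tokens(gold))
    dot = sum(p[t] * g[t] for t in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in g.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    pred = "user shares location data"    # hypothetical model output
    gold = "the user shares location"     # hypothetical ground-truth label
    print(exact_match(pred, gold))        # 0.0
    print(token_f1(pred, gold))           # 0.75
    print(cosine_similarity(pred, gold))  # 0.75
```

In this made-up example the extracted phrase differs from the label by a single word on each side, so exact match scores it as a complete miss while the other two measures credit the three shared tokens.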

Abstract

Mobile applications (apps) are often developed by only a small number of developers with limited resources, especially in the early years of the app's development. In this setting, many requirements acquisition activities, such as interviews, are challenging or lower priority than development and release activities. Moreover, in this early period, requirements change frequently as mobile apps evolve to compete in the marketplace. As app development companies move to standardize their development processes, however, they will shift to documenting and analyzing requirements. One low-cost source of post-deployment requirements is user-authored scenarios, in which users describe how they interact with an app. We propose a method for extracting use case (UC) components from user-authored scenarios using large language models (LLMs). The method consists of a series of prompts that were developed to improve precision and recall on a ground truth dataset of 50 scenarios independently labeled with UC components. Our results reveal that LLMs require additional domain knowledge to extract UC components, and that refining prompts to include this knowledge improves the quality of the extracted UC components.

Paper Structure (28 sections, 4 figures, 9 tables)

This paper contains 28 sections, 4 figures, 9 tables.

Figures (4)

  • Figure 1: The Overview of Scenario and UC Component Collection Approach
  • Figure 2: A User-Authored Scenario Example
  • Figure 3: Seed Prompt
  • Figure 4: Scenarios and UC Defects Examples