
A First Look at GPT Apps: Landscape and Vulnerability

Zejun Zhang, Li Zhang, Xin Yuan, Anlan Zhang, Mengwei Xu, Feng Qian

TL;DR

This paper presents the first large-scale longitudinal study of the GPT app ecosystem across two stores, using automated web scraping and a novel TriLevel configuration extraction strategy to map landscape dynamics, user engagement, and configuration vulnerabilities. Key findings show rising user enthusiasm but plateauing creator activity, widespread exposure of system prompts (which drives plagiarism), and engagement heavily skewed toward a small subset of apps. The work introduces a scalable dataset and the TriLevel method, and offers actionable guidance for stores, developers, and users on improving security, discovery, and ecosystem health. The study underscores the need for continuous monitoring, stricter app review, and incentives to sustain secure, diverse GPT applications.

Abstract

Following OpenAI's introduction of GPTs, a surge in GPT apps has led to the launch of dedicated LLM app stores. Nevertheless, given its recent debut, this new ecosystem is not yet well understood. To fill this gap, this paper presents a first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: GPTStore.AI and the official OpenAI GPT Store. Specifically, we develop two automated tools and a TriLevel configuration extraction strategy to efficiently gather metadata (e.g., names, creators, and descriptions) and user feedback for all GPT apps across these two stores, as well as configurations (i.e., system prompts, knowledge files, and APIs) for the top 10,000 popular apps. Our extensive analysis reveals: (1) user enthusiasm for GPT apps rises consistently, whereas creator interest plateaus within three months of GPTs' launch; (2) nearly 90% of system prompts can be easily accessed due to a widespread failure to secure GPT app configurations, leading to considerable plagiarism and duplication among apps. Our findings highlight the need for app stores, creators, and users to strengthen the LLM app ecosystem.
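
To make the duplication finding concrete, the following is a minimal sketch of one way to flag near-duplicate system prompts once they have been extracted. It is not the authors' method: the word-shingle Jaccard similarity, the function names (shingles, jaccard, near_duplicates), the shingle size, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a lowercased prompt (hypothetical helper)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        # Prompts shorter than k words become a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def near_duplicates(prompts: dict, threshold: float = 0.8) -> list:
    """Return (app_x, app_y, similarity) for every pair at or above the threshold.

    `prompts` maps an app identifier to its extracted system prompt.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    sets = {app: shingles(text) for app, text in prompts.items()}
    pairs = []
    for x, y in combinations(sorted(sets), 2):
        score = jaccard(sets[x], sets[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 3)))
    return pairs


if __name__ == "__main__":
    sample = {
        "app_a": "You are a helpful travel planner that books flights",
        "app_b": "You are a helpful travel planner that books flights",
        "app_c": "Translate text to French",
    }
    print(near_duplicates(sample))
```

The pairwise loop is quadratic in the number of apps, so for a corpus on the order of the top 10,000 apps an approximate scheme such as MinHash with locality-sensitive hashing would likely be a better fit.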

Paper Structure

This paper contains 24 sections, 17 figures, and 3 tables.

Figures (17)

  • Figure 1: Configurations for building GPT Apps
  • Figure 2: The overview of Methodology
  • Figure 3: Distribution of System Prompts Length in Top 10,000 GPT apps.
  • Figure 4: Popular Types of Knowledge Files in Top 10,000 GPT apps.
  • Figure 5: Success Rate for Acquiring System Prompts in the Top 10,000 Apps and Weekly Top 500 Apps.
  • ...and 12 more figures