A First Look at GPT Apps: Landscape and Vulnerability
Zejun Zhang, Li Zhang, Xin Yuan, Anlan Zhang, Mengwei Xu, Feng Qian
TL;DR
This paper presents the first large-scale longitudinal study of the GPT app ecosystem across two stores, GPTStore.AI and the official OpenAI GPT Store, using automated web scraping and a novel TriLevel configuration extraction strategy to map landscape dynamics, user engagement, and configuration vulnerabilities. Key findings show rising user enthusiasm but plateauing creator activity, widespread exposure of system prompts (which enables plagiarism), and engagement skewed heavily toward a small subset of apps. The work contributes a large-scale dataset and the TriLevel method, and offers guidance for stores, developers, and users on improving security, discovery, and ecosystem health. The study underscores the need for continuous monitoring, stricter app review, and incentives to sustain secure, diverse GPT applications.
Abstract
Following OpenAI's introduction of GPTs, a surge in GPT apps has led to the launch of dedicated LLM app stores. However, because this ecosystem debuted only recently, it remains poorly understood. To fill this gap, this paper presents the first comprehensive longitudinal (5-month) study of the evolution, landscape, and vulnerability of the emerging LLM app ecosystem, focusing on two GPT app stores: \textit{GPTStore.AI} and the official \textit{OpenAI GPT Store}. Specifically, we develop two automated tools and a TriLevel configuration extraction strategy to efficiently gather metadata (e.g., names, creators, and descriptions) and user feedback for all GPT apps across these two stores, as well as configurations (i.e., system prompts, knowledge files, and APIs) for the 10,000 most popular apps. Our extensive analysis reveals that (1) user enthusiasm for GPT apps rises consistently, whereas creator interest plateaus within three months of GPTs' launch; and (2) nearly 90\% of system prompts can be easily accessed due to widespread failure to secure GPT app configurations, leading to considerable plagiarism and duplication among apps. Our findings highlight the need for app stores, creators, and users to strengthen the LLM app ecosystem.
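
To make the duplication finding concrete, the following is a minimal sketch of how near-identical system prompts could be flagged once they have been collected. It is an illustration under stated assumptions, not the method used in the study: the function name `near_duplicate_pairs`, the 0.9 similarity threshold, and the sample prompts are all hypothetical, and the similarity measure is the standard-library `difflib.SequenceMatcher` ratio.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicate_pairs(prompts, threshold=0.9):
    """Return (id_a, id_b, ratio) for prompt pairs with similarity >= threshold.

    `prompts` maps an app identifier to its system prompt text.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(prompts.items()), 2):
        # Normalize case and whitespace so trivial edits do not hide copies.
        a = " ".join(text_a.lower().split())
        b = " ".join(text_b.lower().split())
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((id_a, id_b, ratio))
    return pairs


# Hypothetical prompts, not taken from the study's dataset.
sample_prompts = {
    "app-1": "You are a helpful travel planner. Always ask for the budget first.",
    "app-2": "You are a helpful  travel planner. Always ask for the budget first!",
    "app-3": "Translate the user's text into French and explain idioms.",
}

for id_a, id_b, ratio in near_duplicate_pairs(sample_prompts):
    print(f"{id_a} and {id_b} look duplicated (similarity {ratio:.2f})")
```

Pairwise comparison is quadratic in the number of prompts, so a corpus of 10,000 apps would likely call for a cheaper first pass (for example, hashing normalized text) before any fine-grained comparison.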
