Agent Skills: A Data-Driven Analysis of Claude Skills for Extending Large Language Model Functionality
George Ling, Shanshan Zhong, Richard Huang
TL;DR
The paper quantifies the emerging agent skills ecosystem by analyzing 40,285 publicly listed skills from a major marketplace, examining growth, length, redundancy, usage, and safety implications. Using a data-driven approach, it collects metadata, uses token length as a proxy for prompt budget usage, and applies a two-level taxonomy plus LLM-based risk auditing to characterize the landscape. Key findings include rapid, bursty growth that tracks community attention signals, a heavy-tailed yet generally compact skill length distribution, and substantial near-duplicate listings that complicate discovery and maintenance. The safety analysis finds that a majority of skills pose low risk, but a non-trivial subset enables state-changing or system-level actions, underscoring the need for least-privilege design, sandboxing, and clearer risk labeling to accompany skill reuse at scale. These insights inform directions for standardization, de-duplication, and governance, with the aim of scalable, safe reuse of agent skills as infrastructure for LLM agents.
Abstract
Agent skills extend large language model (LLM) agents with reusable, program-like modules that define triggering conditions, procedural logic, and tool interactions. As these skills proliferate in public marketplaces, it remains unclear what types of skills are available, how users adopt them, and what risks they pose. To answer these questions, we conduct a large-scale, data-driven analysis of 40,285 publicly listed skills from a major marketplace. Our results show that skill publication tends to occur in short bursts that track shifts in community attention. We also find that skill content is highly concentrated in software engineering workflows, while information retrieval and content creation account for a substantial share of adoption. Beyond content trends, we uncover a pronounced supply-demand imbalance across categories, and we show that most skills remain within typical prompt budgets despite a heavy-tailed length distribution. Finally, we observe strong ecosystem homogeneity, with widespread intent-level redundancy, and we identify non-trivial safety risks, including skills that enable state-changing or system-level actions. Overall, our findings provide a quantitative snapshot of agent skills as an emerging infrastructure layer for LLM agents and inform future work on skill reuse, standardization, and safety-aware design.
