Agent Skills: A Data-Driven Analysis of Claude Skills for Extending Large Language Model Functionality

George Ling; Shanshan Zhong; Richard Huang

TL;DR

The paper quantifies the emergent agent skills ecosystem by analyzing 40,285 publicly listed skills from a major marketplace, examining growth, length, redundancy, usage, and safety implications. Using a data-driven approach, it collects metadata, uses token length as a proxy for prompt budgets, and applies a two-level taxonomy plus LLM-based risk auditing to characterize the landscape. Key findings include rapid, bursty growth synchronized with community attention signals, a heavy-tailed yet generally compact skill length distribution, and substantial near-duplicate listings that complicate discovery and maintenance. Safety analysis reveals a majority of skills pose low risk, but a non-trivial subset enables state-changing or system-level actions, underscoring the need for least-privilege design, sandboxing, and clearer risk labeling to accompany skill reuse at scale. These insights inform directions for standardization, de-duplication, and governance to realize scalable, safe reuse of agent skills as infrastructure for LLM agents.

Abstract

Agent skills extend large language model (LLM) agents with reusable, program-like modules that define triggering conditions, procedural logic, and tool interactions. As these skills proliferate in public marketplaces, it is unclear what types are available, how users adopt them, and what risks they pose. To answer these questions, we conduct a large-scale, data-driven analysis of 40,285 publicly listed skills from a major marketplace. Our results show that skill publication tends to occur in short bursts that track shifts in community attention. We also find that skill content is highly concentrated in software engineering workflows, while information retrieval and content creation account for a substantial share of adoption. Beyond content trends, we uncover a pronounced supply-demand imbalance across categories, and we show that most skills remain within typical prompt budgets despite a heavy-tailed length distribution. Finally, we observe strong ecosystem homogeneity, with widespread intent-level redundancy, and we identify non-trivial safety risks, including skills that enable state-changing or system-level actions. Overall, our findings provide a quantitative snapshot of agent skills as an emerging infrastructure layer for agents and inform future work on skill reuse, standardization, and safety-aware design.

Paper Structure (36 sections, 15 figures, 1 table)


Introduction
Skill Data and Growth Trends
Data Collection
Skill Growth Trends
Growth is rapid and bursty.
Growth aligns with an application-level popularity signal.
Skill Length and Redundancy
Skill Length Characteristics
Typical skills are short.
A few skills are long due to the inclusion of multiple components.
Intent-level Redundancy Analysis
Redundancy measuring.
Nearly half of listings are duplicates.
Implications for discovery and maintenance.
Skill Usage Patterns
...and 21 more sections

Figures (15)

Figure 1: Growth trend of agent skills. According to the agent skills platform https://skill.sh, the number of recorded skills grew rapidly from mid-January to early February 2026, exceeding 40,000 by early February. During the same period, the popular open-source skills application OpenClaw saw a sharp surge in GitHub stars, reaching over 25,000 stars in a single day at the end of January, followed by a gradual decline, with the total number of stars exceeding 170k.
Figure 2: Token-count distribution of agent skills. The distribution is heavy-tailed: the median skill contains 1,414 tokens (mean: 1,895). 90% of skills are no longer than 3,935 tokens and 99% are no longer than 9,253 tokens. A small fraction are exceptionally long, with a maximum of 116,239 tokens.
Figure 3: Name-based redundancy distribution. Skills are grouped by normalized names using case-insensitive matching after removing special characters. We report the fraction of skills that appear $n$ times, denoted as $n\times$. Skills that appear once account for 53.7%, while skills that appear more than once account for 46.3%. The names of the 30 most redundant skills under this metric are listed in Appendix Figure 4.
Figure 4: Top 30 redundant skills by name-based matching. We rank skills by the number of listings that share the same normalized name, using the exact matching procedure in Section \ref{sec:redundancy}. This figure lists the 30 most frequently repeated skill names and their repetition counts.
Figure 5: Word clouds of skill names by major category. For each major category in Section \ref{sec:content}, we show the most frequent terms derived from the skill document. Words are retained only if their frequency in the target category exceeds $1.5$ times the average of the remaining five categories. Font size is proportional to within-category frequency, highlighting common topics and recurring workflow motifs.
...and 10 more figures
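
The name-based redundancy measure described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that "removing special characters" means dropping every non-alphanumeric character, and that the reported fractions are computed over listings rather than over unique names. The function names and the sample skill names are hypothetical.

```python
import re
from collections import Counter


def normalize_name(name: str) -> str:
    """Lowercase the name and drop every character that is not a-z or 0-9.

    Assumption: this is one plausible reading of "case-insensitive matching
    after removing special characters"; the exact rule may differ.
    """
    return re.sub(r"[^a-z0-9]", "", name.lower())


def redundancy_distribution(names):
    """Return {n: fraction of listings whose normalized name appears n times}."""
    counts = Counter(normalize_name(n) for n in names)
    total = sum(counts.values())
    listings_by_n = Counter()
    for c in counts.values():
        # A name that appears c times contributes c listings to bucket c.
        listings_by_n[c] += c
    return {n: k / total for n, k in sorted(listings_by_n.items())}


# Hypothetical listing names, for illustration only.
names = [
    "PDF-Reader", "pdf reader", "pdf_reader",
    "Git Helper", "git-helper",
    "Web Search",
]
dist = redundancy_distribution(names)
duplicate_share = 1.0 - dist.get(1, 0.0)
print(dist)
print(f"share of listings with a repeated name: {duplicate_share:.3f}")
```

Under these assumptions, the share of listings in the 1x bucket corresponds to the 53.7% figure in the caption, and the remaining buckets together correspond to the 46.3% figure.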
