Automated Skill Decomposition Meets Expert Ontologies: Bridging the Granularity Gap with LLMs
Le Ngoc Luyen, Marie-Hélène Abel
TL;DR
The paper tackles the granularity gap between broad skill labels in authoritative ontologies and the finer sub-skills needed for adaptive learning and workforce mapping. It proposes an ontology-grounded evaluation framework and introduces the ROME-ESCO-DecompSkill benchmark to assess LLM-based skill decomposition under zero-shot and leakage-safe few-shot prompting. Using semantic F1 and hierarchy-aware F1 metrics, the study shows that zero-shot prompting provides a solid baseline, while few-shot prompts improve phrasing stability and taxonomic placement; exemplar choice also affects latency and coverage across model sizes. The findings offer a reproducible foundation for ontology-faithful skill decomposition, with practical implications for curriculum design, personalized learning, and employment services, and point to future work in retrieval-augmented grounding and multilingual deployment.
Abstract
This paper investigates automated skill decomposition using Large Language Models (LLMs) and proposes a rigorous, ontology-grounded evaluation framework. Our framework standardizes the pipeline from prompting and generation to normalization and alignment with ontology nodes. To evaluate outputs, we introduce two metrics: a semantic F1-score that uses optimal embedding-based matching to assess content accuracy, and a hierarchy-aware F1-score that credits structurally correct placements to assess granularity. We conduct experiments on ROME-ESCO-DecompSkill, a curated subset of parent skills, comparing two prompting strategies: zero-shot and leakage-safe few-shot with exemplars. Across diverse LLMs, zero-shot offers a strong baseline, while few-shot consistently stabilizes phrasing and granularity and improves hierarchy-aware alignment. A latency analysis further shows that exemplar-guided prompts are competitive with, and sometimes faster than, unguided zero-shot prompts because their completions are more schema-compliant. Together, the framework, benchmark, and metrics provide a reproducible foundation for developing ontology-faithful skill decomposition systems.
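To make the idea of a semantic F1-score with optimal embedding-based matching concrete, the following Python sketch shows one way such a metric could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names `cosine` and `semantic_f1`, the similarity threshold of 0.8, the rule that a matched pair counts only when its cosine similarity reaches that threshold, and the toy two-dimensional vectors standing in for real embeddings are all hypothetical. The abstract does not specify these details.

```python
from itertools import permutations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_f1(pred_vecs, gold_vecs, threshold=0.8):
    """Sketch of a semantic F1 over embedded sub-skills.

    Assumptions (not taken from the paper): a one-to-one assignment that
    maximizes total cosine similarity is found first, and a matched pair is
    counted as correct only if its similarity is >= `threshold`.
    """
    if not pred_vecs or not gold_vecs:
        return 0.0

    sim = [[cosine(p, g) for g in gold_vecs] for p in pred_vecs]

    # Put the shorter side on the rows so every row receives a distinct column.
    if len(sim) > len(sim[0]):
        sim = [list(col) for col in zip(*sim)]
    n_rows, n_cols = len(sim), len(sim[0])

    # Brute-force optimal assignment: fine for a handful of sub-skills,
    # but factorial in the list size.
    best = max(
        permutations(range(n_cols), n_rows),
        key=lambda cols: sum(sim[r][c] for r, c in enumerate(cols)),
    )

    matched = sum(1 for r, c in enumerate(best) if sim[r][c] >= threshold)
    if matched == 0:
        return 0.0

    precision = matched / len(pred_vecs)
    recall = matched / len(gold_vecs)
    return 2 * precision * recall / (precision + recall)


# Toy example: three predicted sub-skills, two reference sub-skills.
predicted = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0]]
score = semantic_f1(predicted, reference)
```

In this toy case two of the three predictions match a reference item, so precision is 2/3, recall is 1, and the score is 0.8. The exhaustive search keeps the sketch dependency-free; for longer lists, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` would be the more practical choice.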
