Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples
Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Shangbin Feng, Zhenwen Liang, Zhihan Zhang, Meng Jiang
TL;DR
Chain-of-Layer (CoL) introduces a layer-wise, in-context learning framework for taxonomy induction that uses Hierarchical Format Taxonomy Induction Instructions (HF) and an Ensemble-based Ranking Filter to mitigate hallucinations. By splitting the task into top-down layers and validating each step with an ensemble of templates and a masked-language model, CoL achieves state-of-the-art performance on WordNet sub-taxonomies and three large-scale taxonomies, while CoL-Zero demonstrates strong cross-domain adaptability. The approach systematically improves both precision and structural coherence over single-pass prompting methods, and ablation studies highlight the complementary roles of CoL and the filter. Overall, CoL offers a scalable, interpretable paradigm for constructing coherent taxonomies from limited example sets, with practical impact for search, recommendation, and QA systems.
Abstract
Automatic taxonomy induction is crucial for web search, recommendation systems, and question answering. Manual curation of taxonomies is expensive in terms of human effort, making automatic taxonomy construction highly desirable. In this work, we introduce Chain-of-Layer, an in-context learning framework designed to induce taxonomies from a given set of entities. Chain-of-Layer breaks the task down into selecting relevant candidate entities for each layer and gradually building the taxonomy from top to bottom. To minimize errors, we introduce the Ensemble-based Ranking Filter, which reduces the hallucinated content generated at each iteration. Through extensive experiments, we demonstrate that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.
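
The layer-by-layer procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `chain_of_layer`, the `propose` callback (standing in for the prompted language model), the `score` callback (standing in for the Ensemble-based Ranking Filter), and the `threshold` and `max_layers` parameters are all hypothetical names chosen for illustration. The toy scorer simply averages a few made-up per-template scores, which is an assumption about how an ensemble might be combined.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (parent, child)


def chain_of_layer(
    root: str,
    entities: Set[str],
    propose: Callable[[List[str], Set[str]], List[Edge]],
    score: Callable[[str, str], float],
    threshold: float = 0.5,
    max_layers: int = 10,
) -> List[Edge]:
    """Build a taxonomy top-down, one layer per iteration (illustrative only)."""
    taxonomy: List[Edge] = []
    frontier = [root]                  # entities placed in the previous layer
    remaining = set(entities) - {root}  # entities not yet placed

    for _ in range(max_layers):
        if not frontier or not remaining:
            break
        # Step 1: ask the (hypothetical) model for edges into the next layer.
        candidates = propose(frontier, remaining)

        # Step 2: filter. Drop edges that mention unknown entities
        # (hallucinations) or score below the threshold; keep the
        # best-scoring parent for each surviving child.
        best: Dict[str, Tuple[float, str]] = {}
        for parent, child in candidates:
            if parent not in frontier or child not in remaining:
                continue
            s = score(parent, child)
            if s >= threshold and (child not in best or s > best[child][0]):
                best[child] = (s, parent)

        if not best:
            break
        layer = sorted(best)
        taxonomy.extend((best[child][1], child) for child in layer)
        remaining -= set(layer)
        frontier = layer               # the new layer becomes the next frontier
    return taxonomy


# Toy stand-ins so the sketch runs without any model.
gold = {"animal": ["dog", "cat"], "dog": ["poodle"]}


def toy_propose(frontier: List[str], remaining: Set[str]) -> List[Edge]:
    edges = [(p, c) for p in frontier for c in gold.get(p, []) if c in remaining]
    edges.append((frontier[0], "unicorn"))  # simulated hallucinated entity
    return edges


def toy_score(parent: str, child: str) -> float:
    # Pretend three prompt templates each returned a plausibility score.
    per_template = [1.0, 0.8, 0.9] if child in gold.get(parent, []) else [0.1, 0.2, 0.0]
    return sum(per_template) / len(per_template)


edges = chain_of_layer("animal", {"animal", "dog", "cat", "poodle"}, toy_propose, toy_score)
print(edges)  # [('animal', 'cat'), ('animal', 'dog'), ('dog', 'poodle')]
```

In this toy run the simulated "unicorn" edge is discarded at every layer because that entity is not in the input set, which mirrors the role the abstract assigns to filtering hallucinated content at each iteration.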
