Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples

Qingkai Zeng; Yuyang Bai; Zhaoxuan Tan; Shangbin Feng; Zhenwen Liang; Zhihan Zhang; Meng Jiang

TL;DR

Chain-of-Layer (CoL) is a layer-wise, in-context learning framework for taxonomy induction that uses Hierarchical Format Taxonomy Induction Instructions (HF) and an Ensemble-based Ranking Filter to mitigate hallucinations. By splitting the task into top-down layers and validating each step with an ensemble of templates and a masked language model, CoL achieves state-of-the-art performance on WordNet sub-taxonomies and three large-scale taxonomies, while CoL-Zero demonstrates strong cross-domain adaptability. The approach improves both precision and structural coherence over single-pass prompting methods, and ablation studies highlight the complementary roles of CoL and the filter. Overall, CoL offers a scalable, interpretable paradigm for constructing coherent taxonomies from limited examples, with practical impact for search, recommendation, and QA systems.

Abstract

Automatic taxonomy induction is crucial for web search, recommendation systems, and question answering. Manual curation of taxonomies is expensive in terms of human effort, making automatic taxonomy construction highly desirable. In this work, we introduce Chain-of-Layer, an in-context learning framework designed to induce taxonomies from a given set of entities. Chain-of-Layer breaks the task down into selecting relevant candidate entities at each layer and gradually building the taxonomy from top to bottom. To minimize errors, we introduce the Ensemble-based Ranking Filter, which reduces the hallucinated content generated at each iteration. Through extensive experiments, we demonstrate that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.
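
The top-down loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `build_taxonomy`, `toy_proposer`, and the two toy scorers are hypothetical names, the proposer stands in for the LLM call, and averaging scorer outputs against a fixed threshold is a simplified stand-in for the Ensemble-based Ranking Filter.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (parent, child)


def build_taxonomy(
    entities: Iterable[str],
    root: str,
    propose_layer: Callable[[List[str], List[str]], List[Edge]],
    scorers: List[Callable[[str, str], float]],
    threshold: float = 0.5,
) -> Set[Edge]:
    """Grow a taxonomy top-down, adding one layer per iteration."""
    remaining = set(entities) - {root}
    frontier = [root]  # entities placed in the previous layer
    taxonomy: Set[Edge] = set()

    while remaining and frontier:
        best: Dict[str, Tuple[float, str]] = {}  # child -> (score, parent)
        for parent, child in propose_layer(frontier, sorted(remaining)):
            # Discard edges that mention entities outside the allowed sets.
            if parent not in frontier or child not in remaining:
                continue
            # Assumed ensemble rule: mean score over all scorers.
            score = sum(s(parent, child) for s in scorers) / len(scorers)
            if score >= threshold and (child not in best or score > best[child][0]):
                best[child] = (score, parent)
        if not best:
            break  # nothing survived the filter; stop instead of looping forever
        taxonomy.update((parent, child) for child, (_, parent) in best.items())
        frontier = sorted(best)
        remaining -= set(best)
    return taxonomy


# Toy data: a hand-written parent map plays the role of both LLM and scorers.
GOLD = {"physics": "science", "biology": "science",
        "optics": "physics", "genetics": "biology"}


def toy_proposer(frontier: List[str], remaining: List[str]) -> List[Edge]:
    edges = [(GOLD[c], c) for c in remaining if GOLD[c] in frontier]
    edges.append((frontier[0], "astrology"))  # entity not in the input list
    edges.append((frontier[0], "optics"))     # wrong parent, left to the scorers
    return edges


toy_scorers = [
    lambda p, c: 1.0 if GOLD.get(c) == p else 0.0,
    lambda p, c: 0.8 if GOLD.get(c) == p else 0.2,
]

result = build_taxonomy(
    ["science", "physics", "biology", "optics", "genetics"],
    "science", toy_proposer, toy_scorers,
)
print(sorted(result))
```

In this toy run the out-of-list entity is rejected by the membership check, while the wrongly attached edge is rejected by the score threshold. In a real setting the proposer would wrap a prompted LLM and the scorers would wrap template-based plausibility scores.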

Paper Structure (27 sections, 6 equations, 8 figures, 4 tables)

Introduction
Problem Definition
Methodology
Hierarchical Format Taxonomy Induction Instruction (HF)
Few-shot Demonstration Construction
Inference via Chain-of-Layer
Ensemble-based Ranking Filter
Iterative Inference
Demonstrations Generation via LLMs
Experiments
Experimental Setting
Datasets
Baseline Methods
Evaluation Metrics
Results on the WordNet (RQ1)
...and 12 more sections

Figures (8)

Figure 1: Two Types of Methods for Taxonomy Induction
Figure 2: The overview of the framework for Chain-of-Layer (CoL): Given an entity list $\mathcal{V}$ and a root entity $v_0 \in \mathcal{V}$, CoL systematically organizes the entities in $\mathcal{V}$ into hierarchical groups, incrementally adding them to the taxonomy in a top-down manner at each iteration. In detail, at the $k$-th iteration, CoL-K selects a subset of entities $\mathcal{V}_{\text{sel}}$ from the $k$-th level and extends the existing taxonomy $\mathcal{T}^{k-1}$ with these entities. The newly generated parent-child relations ($\mathcal{T}^{k} \setminus \mathcal{T}^{k-1}$) are refined by an Ensemble-based Ranking Filter to reduce hallucinations, yielding the output taxonomy $\mathcal{T}^{k}$ of the $k$-th iteration. The process continues until all entities in $\mathcal{V}$ are integrated into the resulting taxonomy.
Figure 3: Prompt Overview of Chain-of-Layer Framework
Figure 4: The details of the Ensemble-based Ranking Filter.
Figure 5: Performance analysis of CoL across varying scales and domains. It shows Edge, Ancestor, and Node F1-scores for the Wiki, DBLP, and SemEval-Sci taxonomies, with sizes ranging from 20 to 160 entities. An inflection point appears at the 80-entity threshold across all metrics and domains, highlighting the scalability limitations of CoL.
...and 3 more figures
