Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

Ke Ji; Peng Wang; Wenjun Ke; Guozheng Li; Jiajun Liu; Jingsheng Gao; Ziyu Shang

TL;DR

The paper tackles few-shot hierarchical text classification (HTC) by bridging the unstructured knowledge in PLMs with the downstream domain hierarchy, with a focus on reducing hierarchical inconsistency. It introduces HierICRF, a three-stage pipeline that performs hierarchy-aware reasoning, maps text to hierarchically repeated series via a verbalizer, and enforces hierarchical consistency with a constrained hierarchical iterative CRF decoded by Viterbi. Experiments on WOS and DBpedia show state-of-the-art performance under few-shot settings and better hierarchical consistency than the baselines; the approach is architecture-agnostic and scales to larger LMs. The work presents domain-hierarchy adaptation through path routing as a general paradigm for aligning PLMs with structured downstream tasks, and points to future work extending HierICRF to LLMs that are not fine-tuned.

Abstract

Recently, various pre-trained language models (PLMs) have shown impressive performance on a wide range of few-shot tasks. However, limited by the unstructured prior knowledge in PLMs, it is difficult to maintain consistent performance in complex structured scenarios such as hierarchical text classification (HTC), especially when the downstream data is extremely scarce. The main challenge is how to transfer the unstructured semantic space in PLMs to the downstream domain hierarchy. Unlike previous work on HTC, which directly performs multi-label classification or uses a graph neural network (GNN) to inject the label hierarchy, in this work we study the HTC problem under a few-shot setting to adapt knowledge in PLMs from an unstructured form to the downstream hierarchy. Technically, we design a simple yet effective method named Hierarchical Iterative Conditional Random Field (HierICRF), which searches for the most domain-challenging directions and casts domain-hierarchy adaptation as a hierarchical iterative language modeling problem. It then encourages the model to self-correct for hierarchical consistency during inference, thereby achieving knowledge transfer while preserving hierarchical consistency. We apply HierICRF to various architectures, and extensive experiments on two popular HTC datasets demonstrate that prompting with HierICRF significantly boosts few-shot HTC performance, with an average Micro-F1 improvement of 28.80% to 1.50% and Macro-F1 improvement of 36.29% to 1.5% over the previous state-of-the-art (SOTA) baselines under few-shot settings, while maintaining SOTA hierarchical consistency.
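To make the idea of consistency-preserving inference more concrete, here is a minimal sketch of Viterbi decoding in which transitions that violate the hierarchy are forbidden. It is an illustration under stated assumptions, not the authors' implementation: the function name `constrained_viterbi`, the toy labels (`A`, `B`, `A1`, `B1`), the emission scores, and the rule that only parent-to-child transitions are allowed are all hypothetical, and learned transition scores are omitted for brevity.

```python
NEG_INF = float("-inf")

def constrained_viterbi(emissions, labels, allowed):
    """Return the highest-scoring label path that uses only allowed transitions.

    emissions: list of dicts (one per step) mapping label -> score.
    labels:    all label names across hierarchy layers.
    allowed:   set of (previous_label, next_label) pairs permitted by the
               hierarchy (assumption: parent -> child edges only).
    """
    # best[t][y] is the best score of any allowed path ending in label y at step t.
    best = [{y: emissions[0].get(y, NEG_INF) for y in labels}]
    back = []  # back[t-1][y] is the predecessor of y on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            candidates = [
                (best[-1][p], p)
                for p in labels
                if (p, y) in allowed and best[-1][p] > NEG_INF
            ]
            if candidates:
                prev_score, prev_label = max(candidates)
                scores[y] = prev_score + emissions[t].get(y, NEG_INF)
                pointers[y] = prev_label
            else:
                scores[y] = NEG_INF
                pointers[y] = None
        best.append(scores)
        back.append(pointers)

    last = max(labels, key=lambda y: best[-1][y])
    if best[-1][last] == NEG_INF:
        raise ValueError("no label path satisfies the hierarchy constraints")

    # Follow the back-pointers from the last step to the first.
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy two-layer hierarchy (hypothetical): A -> A1, B -> B1.
labels = ["A", "B", "A1", "B1"]
allowed = {("A", "A1"), ("B", "B1")}
emissions = [
    {"A": 1.0, "B": 0.9, "A1": 0.0, "B1": 0.0},  # layer-1 scores
    {"A": 0.0, "B": 0.0, "A1": 0.2, "B1": 1.5},  # layer-2 scores
]

# Picking the per-step argmax independently gives an inconsistent path.
greedy = [max(step, key=step.get) for step in emissions]
decoded = constrained_viterbi(emissions, labels, allowed)
```

In this toy case the per-step argmax yields `["A", "B1"]`, which is not a valid path in the hierarchy, while the constrained decoder returns `["B", "B1"]` because it scores whole paths rather than individual layers.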

Paper Structure (22 sections, 5 equations, 2 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Hierarchical Text Classification.
Methodology
Problem Statements
Framework Overview
Chain of Hierarchy-Aware Reasoning
Text to Hierarchically Repeated Series
Hierarchical Iterative Conditional Random Fields
Decoding
Experiments
Experimental Settings
Datasets and Evaluation Metrics.
Baselines.
Backbone and Implementation Details.
...and 7 more sections

Figures (2)

Figure 1: Illustration of methods for HTC. The red sequence represents the gold label, and the purple sequence represents the predicted sequence. Hierarchical inconsistency occurs when the outputs of different layers conflict with the hierarchical dependency tree; for example, the model predicts CP, which is not a child node of another of its outputs, such as Medicare.
Figure 2: The overview of HierICRF. There are two ways to inject hierarchical constraints: (a) Chain of hierarchy-aware reasoning and (b) Hierarchical iterative CRF. At stage 3, we select the predictions $\{y_3, y_4, y_5\}$ (the last path routing iteration) as the final outputs.
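The hierarchical inconsistency described in the Figure 1 caption can be checked mechanically for a predicted path. The sketch below is only an illustration: the helper name `is_hierarchically_consistent`, the `parent_of` mapping, and the toy labels are hypothetical, and it assumes one predicted label per layer, ordered from the top layer down.

```python
def is_hierarchically_consistent(path, parent_of):
    """Check that each predicted label is a child of the label predicted one layer above.

    path:      predicted labels ordered from the top layer to the bottom layer.
    parent_of: dict mapping each label to its parent (root-layer labels map to None).
    """
    # The first label must be a known top-layer label.
    if not path or path[0] not in parent_of or parent_of[path[0]] is not None:
        return False
    # Every adjacent pair must be a (parent, child) edge in the tree.
    return all(
        parent_of.get(child) == parent
        for parent, child in zip(path, path[1:])
    )


# Toy hierarchy (hypothetical): A -> A1, B -> B1.
parent_of = {"A": None, "B": None, "A1": "A", "B1": "B"}

consistent = is_hierarchically_consistent(["A", "A1"], parent_of)
inconsistent = is_hierarchically_consistent(["A", "B1"], parent_of)
```

Here `["A", "A1"]` follows an edge of the tree, whereas `["A", "B1"]` pairs a child with the wrong parent, which is the kind of conflict the caption describes.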
