Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification

Zihan Wang, Peiyi Wang, Houfeng Wang

TL;DR

Hierarchical text classification (HTC) faces challenges from imbalanced labels and complex taxonomies. This work introduces HiAdv, a hierarchy-aware adversarial framework that integrates a local per-input hierarchy into existing HTC models by aligning their representations with a local-hierarchy-informed oracle via adversarial training. HiAdv is model-agnostic, demonstrated to consistently boost performance for both weaker and stronger backbones, especially on deep hierarchies and rare classes, and it surpasses prior local-hierarchy methods. The approach yields state-of-the-art results on challenging datasets and maintains inference efficiency, making it a practical enhancement for real-world HTC systems.

Abstract

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification due to its complex taxonomic structure. Nearly all recent HTC work focuses on how the labels are structured but ignores the sub-structure formed by the ground-truth labels of each input text, which contains rich label co-occurrence information. In this work, we introduce this local hierarchy through an adversarial framework. We propose HiAdv, a framework that fits nearly all HTC models and optimizes them with the local hierarchy as auxiliary information. We test it on two typical HTC models and find that HiAdv is effective in all scenarios and adept at handling complex taxonomic hierarchies. Further experiments demonstrate that the improvement from our framework indeed comes from the local hierarchy, and that the local hierarchy benefits rare classes, which have insufficient training data.
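To make the notion of a local hierarchy concrete, here is a minimal sketch in Python. It assumes the global hierarchy is stored as a child-to-parent mapping and treats the local hierarchy of one input as the edges of the global hierarchy whose two endpoints are both ground-truth labels of that input. The function name `local_hierarchy`, the data layout, and the toy labels are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple


def local_hierarchy(parent: Dict[str, str], gold_labels: Set[str]) -> List[Tuple[str, str]]:
    """Return the (parent, child) edges of the global hierarchy that connect
    two ground-truth labels of a single input (hypothetical helper)."""
    return sorted(
        (p, c)
        for c, p in parent.items()
        if c in gold_labels and p in gold_labels
    )


# Toy global hierarchy, stored as child -> parent (labels are made up).
GLOBAL_PARENT = {
    "Football": "Sports",
    "Tennis": "Sports",
    "Movies": "Arts",
    "Music": "Arts",
}

# Ground-truth labels of one input text.
gold = {"Sports", "Football"}

edges = local_hierarchy(GLOBAL_PARENT, gold)
print(edges)
```

The global hierarchy is the same for every input, whereas the edge set returned here changes with the ground-truth labels, which is why it is only available as auxiliary information during training.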

Paper Structure

This paper contains 31 sections, 15 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: A demonstration of our adversarial framework. A generator and an encoder use global and local hierarchy as input respectively, and the output representations are trained adversarially.
  • Figure 2: Two HTC models and an abstract HTC model. (a) HiBERT, which feeds the text representation output by BERT into a graph encoder. (b) HPT, a prompt-tuning model with a hierarchy-aware template. (c) An abstract model, with two encoders handling text and structure respectively and a mixture mechanism that produces a mixed representation.
  • Figure 3: A demonstration of our adversarial framework. The generator and the encoder share the same text encoder. For clarity, we omit the classifier, which takes $\mathbf{h}_\mathrm{mix}$ and $\hat{\mathbf{h}}_\mathrm{mix}$ as input and produces the classification losses $L_\mathrm{C}$ and $\hat{L}_\mathrm{C}$ during training.
  • Figure 4: Macro F1 scores of label clusters on the development set of NYT. (a) Label clusters grouped by the number of training samples. >80% means that the labels in this cluster have more training instances than 80% of all labels; the other clusters are defined similarly. (b) Label clusters grouped by depth in the hierarchy.