ICXML: An In-Context Learning Framework for Zero-Shot Extreme Multi-Label Classification

Yaxin Zhu, Hamed Zamani

TL;DR

ICXML tackles zero-shot extreme multi-label classification with a two-stage in-context learning pipeline: generation-based demonstration construction to shortlist candidate labels, followed by LLM-based reranking to select the final labels. Because an extremely large label set cannot be placed in a single prompt, the approach first enriches the context with generated demonstrations and then narrows the label space via semantic mapping. It achieves state-of-the-art performance on two large benchmarks (LF-Amazon-131K and LF-WikiSeeAlso-320K) without relying on paired training data, and it is robust across model backbones (including GPT-3.5/4 and open models). Overall, ICXML highlights the potential of generation-guided in-context learning for scalable, weakly supervised extreme classification tasks.

Abstract

This paper focuses on the task of Extreme Multi-Label Classification (XMC), whose goal is to predict multiple labels for each instance from an extremely large label space. While existing research has primarily focused on fully supervised XMC, real-world scenarios often lack supervision signals, highlighting the importance of zero-shot settings. Given the large label space, applying in-context learning approaches is not trivial. We address this issue by introducing In-Context Extreme Multilabel Learning (ICXML), a two-stage framework that cuts down the search space by generating a set of candidate labels through in-context learning and then reranks them. Extensive experiments suggest that ICXML advances the state of the art on two diverse public benchmarks.
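
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`shortlist`, `icxml_sketch`, `toy_generate`, `toy_rerank`) are hypothetical, token-overlap similarity stands in for whatever semantic mapping the paper uses, and the LLM calls are replaced by plain Python callables so the example runs without any model.

```python
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a semantic retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def shortlist(generated: Sequence[str], label_space: Sequence[str], k: int) -> List[str]:
    """Map free-form generated labels onto the k closest labels in the label space."""
    scored = [
        (max((jaccard(g, y) for g in generated), default=0.0), y)
        for y in label_space
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties broken by name
    return [y for _, y in scored[:k]]


def icxml_sketch(
    x: str,
    label_space: Sequence[str],
    generate: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    k: int = 3,
    top_n: int = 2,
) -> List[str]:
    """Generate candidate labels, shortlist them, then rerank the shortlist."""
    generated = generate(x)                             # stage 1: propose labels
    candidates = shortlist(generated, label_space, k)   # stage 1: narrow label space
    return rerank(x, candidates)[:top_n]                # stage 2: listwise reranking


def toy_generate(x: str) -> List[str]:
    """Hypothetical stub for an LLM that proposes free-form labels."""
    return ["wireless mouse", "usb keyboard"]


def toy_rerank(x: str, candidates: List[str]) -> List[str]:
    """Hypothetical stub for an LLM reranker; orders candidates by overlap with x."""
    return sorted(candidates, key=lambda y: -jaccard(x, y))
```

Calling `icxml_sketch(query, labels, toy_generate, toy_rerank)` with real model-backed callables in place of the stubs would follow the same control flow; only the shortlist, never the full label space, reaches the reranker.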

Paper Structure

This paper contains 19 sections, 7 equations, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: An illustration of the proposed generate-rerank framework. For a given test input $x_i$, we generate demonstrations $D_i$ to facilitate ICL-based shortlisting. The shortlisted set $\bar{Y}_i$ is then provided to an LLM for listwise reranking, which produces the final results $Y^*_i$.
  • Figure 2: Pipeline of the content-based and label-centric approaches for the demonstration generation stage. In the content-based paradigm, an LLM generates the demonstration inputs $z_i^j$, and the corresponding outputs $L_i^j$ are then selected from the label space. Conversely, in the label-centric paradigm, the demonstration outputs $l_i^j$ are first retrieved from the label space, and the corresponding inputs $z_i^j$ are determined afterwards.
  • Figure 3: Results of different demonstration construction strategies on 200 samples from LF-Amazon-131K.
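
The difference between the two paradigms in Figure 2 is the order in which a demonstration's input and output are fixed. The sketch below illustrates that ordering only. The helper names (`top_labels`, `content_based_demos`, `label_centric_demos`) and the toy stubs are hypothetical, token overlap is an assumed stand-in for the paper's label selection and retrieval, and the caption does not specify how label-centric inputs are determined, so that step is an injected callable.

```python
from typing import Callable, List, Sequence, Tuple

Demo = Tuple[str, List[str]]  # (demonstration input z, demonstration labels)


def overlap(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for semantic matching."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_labels(text: str, label_space: Sequence[str], k: int) -> List[str]:
    """Return the k labels most similar to text (ties broken alphabetically)."""
    ranked = sorted(label_space, key=lambda y: (-overlap(text, y), y))
    return ranked[:k]


def content_based_demos(
    x: str,
    label_space: Sequence[str],
    generate_inputs: Callable[[str], List[str]],
    labels_per_demo: int = 1,
) -> List[Demo]:
    """Inputs first: generate demo inputs z, then select their labels."""
    return [
        (z, top_labels(z, label_space, labels_per_demo))
        for z in generate_inputs(x)
    ]


def label_centric_demos(
    x: str,
    label_space: Sequence[str],
    write_input: Callable[[str], str],
    n_demos: int = 2,
) -> List[Demo]:
    """Labels first: retrieve demo labels for x, then determine an input for each."""
    return [(write_input(l), [l]) for l in top_labels(x, label_space, n_demos)]


def toy_generate_inputs(x: str) -> List[str]:
    """Hypothetical stub for an LLM that writes demonstration inputs."""
    return ["optical mouse for laptops", "mechanical keyboard with usb cable"]


def toy_write_input(label: str) -> str:
    """Hypothetical stub that turns a label into a demonstration input."""
    return f"a product listing for {label.lower()}"
```

Both functions return the same `Demo` shape, so either set of demonstrations could be placed in the shortlisting prompt without changing the rest of a pipeline.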