From Haystack to Needle: Label Space Reduction for Zero-shot Classification
Nathan Vandemoortele, Bram Steenwinckel, Femke Ongenae, Sofie Van Hoecke
TL;DR
Zero-shot classification over large label spaces is difficult for LLMs because of attention and reasoning constraints. The paper introduces Label Space Reduction (LSR), an iterative framework that ranks and prunes candidate labels so the LLM can concentrate on the most relevant options, and distills the procedure into a probabilistic classifier for efficient inference. Across seven benchmarks, LSR improves macro-F1 over standard zero-shot baselines by an average of 7.0% (up to 14.2%) with Llama-3.1-70B and 3.3% (up to 11.1%) with Claude-3.5-Sonnet. The distilled classifier keeps performance competitive at lower inference cost, and the paper outlines directions for automation and extension to multi-label settings.
Abstract
We present Label Space Reduction (LSR), a novel method for improving zero-shot classification performance of Large Language Models (LLMs). LSR iteratively refines the classification label space by systematically ranking and reducing candidate classes, enabling the model to concentrate on the most relevant options. By leveraging unlabeled data with the statistical learning capabilities of data-driven models, LSR dynamically optimizes the label space representation at test time. Our experiments across seven benchmarks demonstrate that LSR improves macro-F1 scores by an average of 7.0% (up to 14.2%) with Llama-3.1-70B and 3.3% (up to 11.1%) with Claude-3.5-Sonnet compared to standard zero-shot classification baselines. To reduce the computational overhead of LSR, which requires an additional LLM call at each iteration, we propose distilling the model into a probabilistic classifier, allowing for efficient inference.
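The iterative loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_space_reduction`, the `classify` and `fit_ranker` interfaces, the `iterations` and `keep` parameters, and the toy stand-ins are all assumptions introduced here. `classify` stands in for the LLM call, and `fit_ranker` stands in for the data-driven model trained on the LLM's own predictions over unlabeled data.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   classify(text, candidates) -> one label from candidates (stand-in for the LLM)
#   fit_ranker(pseudo_labeled) -> scorer for (text, label) pairs (stand-in for
#   the data-driven model fitted on pseudo-labels)
Classify = Callable[[str, Sequence[str]], str]
Scorer = Callable[[str, str], float]
FitRanker = Callable[[List[Tuple[str, str]]], Scorer]


def label_space_reduction(
    texts: Sequence[str],
    labels: Sequence[str],
    classify: Classify,
    fit_ranker: FitRanker,
    iterations: int = 3,
    keep: int = 3,
) -> List[str]:
    """Iteratively rank and reduce each text's candidate labels, then re-classify."""
    candidates: Dict[int, List[str]] = {i: list(labels) for i in range(len(texts))}
    # Initial zero-shot pass over the full label space.
    predictions = [classify(t, candidates[i]) for i, t in enumerate(texts)]
    for _ in range(iterations):
        # Fit the ranker on current pseudo-labels; no gold labels are used.
        score = fit_ranker(list(zip(texts, predictions)))
        for i, text in enumerate(texts):
            ranked = sorted(labels, key=lambda lab: score(text, lab), reverse=True)
            candidates[i] = ranked[:keep]  # keep only the top-ranked labels
        # One additional classifier call per text on the reduced label space.
        predictions = [classify(t, candidates[i]) for i, t in enumerate(texts)]
    return predictions


def toy_classify(text: str, candidates: Sequence[str]) -> str:
    """Toy stand-in for the LLM: first candidate named in the text, else the first."""
    for lab in candidates:
        if lab in text:
            return lab
    return candidates[0]


def toy_fit_ranker(pseudo_labeled: List[Tuple[str, str]]) -> Scorer:
    """Toy stand-in for the data-driven model: word/label co-occurrence counts."""
    counts: Counter = Counter()
    for text, lab in pseudo_labeled:
        for word in text.split():
            counts[(word, lab)] += 1

    def score(text: str, lab: str) -> float:
        return float(sum(counts[(word, lab)] for word in text.split()))

    return score


demo_texts = ["sports match today", "sports final score", "music concert tonight"]
demo_labels = ["music", "sports", "politics"]
print(label_space_reduction(demo_texts, demo_labels, toy_classify, toy_fit_ranker,
                            iterations=2, keep=2))
```

The sketch makes one classifier call per text per iteration, which mirrors the overhead the abstract mentions. The distillation step, which replaces those calls with a probabilistic classifier, is not shown here.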
