Just Leaf It: Accelerating Diffusion Classifiers with Hierarchical Class Pruning
Arundhati S. Shanbhag, Brian B. Moser, Tobias C. Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel
TL;DR
This work tackles the high computational cost of diffusion-based zero-shot classifiers, which must evaluate every candidate class for each input. It introduces the Hierarchical Diffusion Classifier (HDC), which prunes a label tree level by level to discard unlikely categories before applying the diffusion classifier to the remaining leaf nodes, giving up to 60% faster inference with similar or improved accuracy. The approach uses WordNet-derived hierarchies (ImageNet-1K), supports fixed and dynamic pruning strategies, and is compatible with multiple Stable Diffusion backbones and prompt templates. The practical result is a tunable speed–accuracy trade-off that makes training-free diffusion classification more scalable for large-scale tasks. The work also discusses open-set considerations and future improvements in hierarchy construction and efficiency.
Abstract
Diffusion models, celebrated for their generative capabilities, have recently demonstrated surprising effectiveness in image classification tasks by using Bayes' theorem. Yet, current diffusion classifiers must evaluate every label candidate for each input, creating high computational costs that impede their use in large-scale applications. To address this limitation, we propose a Hierarchical Diffusion Classifier (HDC) that exploits hierarchical label structures or well-defined parent-child relationships in the dataset. By pruning irrelevant high-level categories and refining predictions only within relevant subcategories (leaf nodes and sub-trees), HDC reduces the total number of class evaluations. As a result, HDC can speed up inference by as much as 60% while preserving and sometimes even improving classification accuracy. In summary, our work provides a tunable control mechanism between speed and precision, making diffusion-based classification more feasible for large-scale applications.
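The pruning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy label tree, the `error_fn` callback (a hypothetical stand-in for the diffusion model's class-conditional denoising error, where lower means more likely), and the fixed `keep_ratio` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy label tree: parent -> children. Nodes without an entry are leaves.
TREE: Dict[str, List[str]] = {
    "root": ["animal", "vehicle"],
    "animal": ["dog", "cat", "horse", "sheep"],
    "vehicle": ["car", "truck", "bus", "tram"],
}

# Made-up errors standing in for a diffusion model's denoising error per label.
TOY_ERRORS: Dict[str, float] = {
    "animal": 0.2, "vehicle": 0.9,
    "dog": 0.1, "cat": 0.3, "horse": 0.4, "sheep": 0.5,
    "car": 0.6, "truck": 0.7, "bus": 0.8, "tram": 0.95,
}


def prune_then_classify(
    tree: Dict[str, List[str]],
    error_fn: Callable[[str], float],
    root: str = "root",
    keep_ratio: float = 0.5,
) -> Tuple[str, int]:
    """Return (predicted leaf, number of error_fn evaluations)."""
    frontier = [root]          # inner nodes whose children are still candidates
    leaf_pool: List[str] = []  # leaves that survived pruning of their ancestors
    evaluations = 0

    while frontier:
        children = [c for node in frontier for c in tree[node]]
        inner = [c for c in children if c in tree]
        leaf_pool += [c for c in children if c not in tree]

        next_frontier: List[str] = []
        if inner:
            # Score inner nodes and keep only the lowest-error fraction (fixed pruning).
            scores = {c: error_fn(c) for c in inner}
            evaluations += len(inner)
            keep = max(1, int(len(inner) * keep_ratio))
            next_frontier = sorted(inner, key=scores.get)[:keep]
        frontier = next_frontier

    # Final step: score only the surviving leaves and pick the lowest error.
    leaf_scores = {leaf: error_fn(leaf) for leaf in leaf_pool}
    evaluations += len(leaf_pool)
    return min(leaf_scores, key=leaf_scores.get), evaluations


prediction, cost = prune_then_classify(TREE, TOY_ERRORS.__getitem__)
print(prediction, cost)  # dog 6
```

In this toy tree, a flat classifier would score all 8 leaves, while the pruned search scores 2 superclasses and then 4 leaves, for 6 evaluations in total. Raising `keep_ratio` keeps more subtrees alive, trading speed for a lower risk of pruning the correct branch.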
