Just Leaf It: Accelerating Diffusion Classifiers with Hierarchical Class Pruning

Arundhati S. Shanbhag, Brian B. Moser, Tobias C. Nauen, Stanislav Frolov, Federico Raue, Andreas Dengel

TL;DR

This work tackles the high computational cost of diffusion-based zero-shot classifiers, which must evaluate every candidate class for each input. It introduces the Hierarchical Diffusion Classifier (HDC), which prunes a label tree level by level to discard unlikely categories before applying the diffusion classifier to the remaining leaf nodes, yielding up to roughly 60% faster inference with similar or improved accuracy. The approach uses WordNet-derived hierarchies (ImageNet-1K) and flexible pruning strategies (fixed and dynamic), and is compatible with multiple Stable Diffusion backbones and prompt templates. The practical result is a tunable speed–accuracy trade-off that enables scalable, training-free diffusion classification for large-scale tasks; the paper also considers open-set dynamics and future improvements in hierarchy construction and efficiency.

Abstract

Diffusion models, celebrated for their generative capabilities, have recently demonstrated surprising effectiveness in image classification tasks by using Bayes' theorem. Yet, current diffusion classifiers must evaluate every label candidate for each input, creating high computational costs that impede their use in large-scale applications. To address this limitation, we propose a Hierarchical Diffusion Classifier (HDC) that exploits hierarchical label structures or well-defined parent-child relationships in the dataset. By pruning irrelevant high-level categories and refining predictions only within relevant subcategories (leaf nodes and sub-trees), HDC reduces the total number of class evaluations. As a result, HDC can speed up inference by as much as 60% while preserving and sometimes even improving classification accuracy. In summary, our work provides a tunable control mechanism between speed and precision, making diffusion-based classification more feasible for large-scale applications.
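
The abstract's reference to Bayes' theorem can be made concrete with the standard diffusion-classifier formulation, which this summary does not spell out. The block below is a sketch under assumptions: $\varepsilon_\theta$ (the denoising network), $c_i$ (the conditioning prompt for class $i$), and a uniform prior $p(c_i)$ are notation introduced here for illustration, and the approximation replaces the intractable likelihood with its variational bound.

$$
p_\theta(c_i \mid \mathbf{x})
= \frac{p(c_i)\, p_\theta(\mathbf{x} \mid c_i)}{\sum_j p(c_j)\, p_\theta(\mathbf{x} \mid c_j)}
\approx \frac{\exp\!\left\{-\mathbb{E}_{t,\varepsilon}\!\left[\lVert \varepsilon - \varepsilon_\theta(\mathbf{x}_t, c_i) \rVert^2\right]\right\}}
{\sum_j \exp\!\left\{-\mathbb{E}_{t,\varepsilon}\!\left[\lVert \varepsilon - \varepsilon_\theta(\mathbf{x}_t, c_j) \rVert^2\right]\right\}}
$$

Read this way, the predicted class is the one whose prompt yields the lowest expected noise-prediction error, and the sum over $j$ is what makes the cost grow with the number of candidate classes.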

Paper Structure

This paper contains 19 sections, 9 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison between the classical diffusion classifier and our proposed Hierarchical Diffusion Classifier (HDC). While the classical approach evaluates all possible classes to find the correct label, which leads to unnecessary computation, HDC prunes irrelevant classes early, focusing only on the most relevant candidates. This hierarchical pruning reduces computational overhead and accelerates inference.
  • Figure 2: Overview of our Hierarchical Diffusion Classifier (HDC). Starting with an input image $\mathbf{x}$, noise $\varepsilon \sim \mathcal{N}(0, I)$ is added to generate a noisy image, resulting in $\mathbf{x}_t$ for multiple timesteps $t$. Next, we use the diffusion classifier with a reduced number of $\varepsilon$-predictions and hierarchical conditioning prompts like "A photo of a {synclass / class name}" to progressively refine the classification through multiple levels of the label tree. By doing so, we keep track of the most promising classes (highlighted in green) and ignore the rest (highlighted in red). The set of selected nodes during the pruning stage is denoted as $\mathcal{S}^d_{\text{selected}}$, where $d$ denotes the step count during traversal from 1 to $h$, the depth of the tree. Subsequently, the classical diffusion classifier pipeline is applied to the pruned, more specific subcategories (leaf nodes), which results in faster classification overall.
  • Figure 3: Visualization of the ImageNet-1K hierarchy, illustrating the first three levels of its tree structure. The categories are organized from broad entities (e.g., living and non-living things) to more specific groups (e.g., animals, objects, and transport vehicles), with the numbers in parentheses representing the total number of actual classes within each group.
  • Figure 4: Example classification of an image in the pruning stage of the HDC using Strategy 1. In this stage, the error scores of each node are used to iteratively prune the tree, narrowing it down to relevant leaf nodes that will undergo further refinement in subsequent stages. The subsequent steps then focus on closely related nodes (see leaves under the purple line), such as the American Bald Eagle and Vulture, ultimately selecting the leaf node with the lowest error score, Kite (Bird of Prey), in the final stage.
  • Figure 5: Confusion Matrix of HDC (Strategy 1) for the sub-classes under the synset class "Animal". The x-axis shows the predicted labels (including "other classes" outside of the synset class "Animal"), and the y-axis shows the ground-truth labels.
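
The pruning procedure described in the Figure 2 and Figure 4 captions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `Node`, `hierarchical_prune`, `classify`, the `keep_k` parameter, the toy tree, and the `TOY_ERRORS` table are all hypothetical. In particular, `toy_error` is a lookup table standing in for the diffusion model's averaged $\varepsilon$-prediction error for a prompt such as "A photo of a {class name}", and keeping a fixed number of nodes per level is only one possible reading of a fixed pruning strategy.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A node in a hypothetical label tree; leaves are actual classes."""
    name: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def hierarchical_prune(root: Node, error_fn: Callable[[str], float],
                       keep_k: int) -> List[Node]:
    """Descend level by level, keeping the keep_k lowest-error nodes."""
    frontier = list(root.children)
    surviving_leaves: List[Node] = []
    while frontier:
        # Score every node on the current level and keep the best keep_k.
        selected = sorted(frontier, key=lambda n: error_fn(n.name))[:keep_k]
        frontier = []
        for node in selected:
            if node.is_leaf:
                surviving_leaves.append(node)
            else:
                frontier.extend(node.children)
    return surviving_leaves


def classify(root: Node, error_fn: Callable[[str], float],
             keep_k: int = 2) -> str:
    """Prune the tree, then pick the surviving leaf with the lowest error."""
    leaves = hierarchical_prune(root, error_fn, keep_k)
    return min(leaves, key=lambda n: error_fn(n.name)).name


# Toy error table (assumption): stands in for the diffusion model's
# noise-prediction error per prompt. Lower means a better match.
TOY_ERRORS = {
    "animal": 0.20, "object": 0.90,
    "bird": 0.10, "dog": 0.50, "vehicle": 0.80, "tool": 0.95,
    "eagle": 0.30, "kite": 0.05, "beagle": 0.60, "poodle": 0.70,
    "car": 0.85, "truck": 0.90, "hammer": 0.92, "saw": 0.97,
}


def toy_error(name: str) -> float:
    return TOY_ERRORS[name]


tree = Node("root", [
    Node("animal", [
        Node("bird", [Node("eagle"), Node("kite")]),
        Node("dog", [Node("beagle"), Node("poodle")]),
    ]),
    Node("object", [
        Node("vehicle", [Node("car"), Node("truck")]),
        Node("tool", [Node("hammer"), Node("saw")]),
    ]),
])

print(classify(tree, toy_error, keep_k=2))  # prints "kite"
```

On this toy tree the pruned search and an exhaustive search over all eight leaves agree. The eight-leaf tree is too small to show any saving in evaluations; pruning only pays off when each level discards far more classes than it scores.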