Deep Taxonomic Networks for Unsupervised Hierarchical Prototype Discovery
Zekun Wang, Ethan Haarer, Tianyi Zhu, Zhiyi Dai, Christopher J. MacLellan
TL;DR
This work introduces Deep Taxonomic Networks (DTN), a deep latent-variable framework that learns an unlabeled, multilevel taxonomy by maximizing a complete binary-tree Mixture-of-Gaussians prior within a VAE. By formulating the ELBO with a hierarchical prior and a probabilistic, prototypicality-driven objective, DTN discovers interpretable prototypes at all levels, not just leaves, and supports flexible downstream classification without retraining. It integrates transformation-invariant learning via contrastive losses and demonstrates strong hierarchical clustering performance across MNIST, Fashion-MNIST, CIFAR-10/20/100, and Omniglot, while providing qualitative hierarchies that capture coarse-to-fine visual semantics. The results highlight the method’s ability to produce rich, human-interpretable taxonomies and suggest avenues for dynamic priors and broader generative capabilities, albeit with limitations tied to the fixed binary-tree structure and potential scalability considerations.
Abstract
Inspired by the human ability to learn and organize knowledge into hierarchical taxonomies with prototypes, this paper addresses key limitations in current deep hierarchical clustering methods. Existing methods often tie the structure to the number of classes and underutilize the rich prototype information available at intermediate hierarchical levels. We introduce deep taxonomic networks, a novel deep latent variable approach designed to bridge these gaps. Our method optimizes a large latent taxonomic hierarchy, specifically a complete binary tree structured mixture-of-Gaussian prior within a variational inference framework, to automatically discover taxonomic structures and associated prototype clusters directly from unlabeled data without assuming true label sizes. We analytically show that optimizing the ELBO of our method encourages the discovery of hierarchical relationships among prototypes. Empirically, our learned models demonstrate strong hierarchical clustering performance, outperforming baselines across diverse image classification datasets using our novel evaluation mechanism that leverages prototype clusters discovered at all hierarchical levels. Qualitative results further reveal that deep taxonomic networks discover rich and interpretable hierarchical taxonomies, capturing both coarse-grained semantic categories and fine-grained visual distinctions.
