A Rational Model of Dimension-reduced Human Categorization
Yifan Hong, Chen Wang
TL;DR
This work introduces a dimension-reduced rational framework for human categorization that models each category with a prototype and a low-dimensional subspace, using a mixture of probabilistic principal component analyzers (mPPCA). A two-level nonparametric prior (CRP/DP) shares principal components (PCs) across categories, enabling principled few-shot generalization to new categories and subcategories. Theoretical results establish when discarding a PC improves discriminability, and simulations show that dimension reduction can improve or degrade performance depending on how information is distributed across dimensions and on the noise level. Empirical validation on CIFAR-10H shows that mPPCA with a single PC per category captures human categorization and correlates with human choices better than full-rank or baseline models, while few-shot experiments with artificial stimuli reveal context-sensitive generalization consistent with human data. Overall, mPPCA provides a flexible, interpretable account of human categorization that handles high-dimensional stimuli and rapid generalization through dimension-aware representations.
Abstract
Humans can learn to categorize from only a few samples, even when stimuli have many features. To mimic this ability, we propose a novel dimension-reduced category representation using a mixture of probabilistic principal component analyzers (mPPCA). Tests on the CIFAR-10H dataset demonstrate that mPPCA with only a single principal component for each category effectively predicts human categorization of natural images. We further impose a hierarchical prior on mPPCA to account for generalization to new categories. mPPCA captures human behavior in our experiments on images with simple size-color combinations. We also provide necessary and sufficient conditions under which reducing dimensions in categorization is rational.
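As a rough illustration of the category representation described above, the following minimal sketch scores a stimulus under the standard PPCA density, in which each category is a Gaussian with mean (prototype) mu and covariance W W^T + sigma2 I, and then applies Bayes' rule across categories. It is not the authors' implementation: the function names, the uniform prior, the isotropic noise variance, and the two toy categories are all assumptions made for illustration, and the hierarchical prior is omitted.

```python
import numpy as np

def ppca_log_likelihood(x, mu, W, sigma2):
    """Log density of x under N(mu, W W^T + sigma2 * I) (standard PPCA marginal)."""
    d = mu.shape[0]
    C = W @ W.T + sigma2 * np.eye(d)           # low-rank-plus-noise covariance
    diff = x - mu
    _, logdet = np.linalg.slogdet(C)
    maha = diff @ np.linalg.solve(C, diff)     # squared Mahalanobis distance
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def categorize(x, categories, priors=None):
    """Posterior over categories via Bayes' rule (uniform prior if none given)."""
    k = len(categories)
    prior = np.full(k, 1.0 / k) if priors is None else np.asarray(priors, dtype=float)
    log_post = np.log(prior) + np.array(
        [ppca_log_likelihood(x, mu, W, s2) for mu, W, s2 in categories]
    )
    log_post -= log_post.max()                 # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Two hypothetical categories in 3-D, each with a single principal component.
cat_a = (np.array([0.0, 0.0, 0.0]), np.array([[1.0], [0.0], [0.0]]), 0.1)
cat_b = (np.array([3.0, 0.0, 0.0]), np.array([[0.0], [1.0], [0.0]]), 0.1)

posterior = categorize(np.array([0.5, 0.0, 0.0]), [cat_a, cat_b])
```

In this toy setup the stimulus lies along category A's principal component, where A tolerates large deviations from its prototype, so A receives nearly all of the posterior mass. Dimension reduction here corresponds to choosing the number of columns of W.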
