Predicate Hierarchies Improve Few-Shot State Classification
Emily Jin, Joy Hsu, Jiajun Wu
TL;DR
PHIER tackles few-shot state classification by encoding predicate hierarchies into a joint image-predicate latent space. It combines an object-centric scene encoder, self-supervised predicate-relational losses guided by an LLM, and a hyperbolic latent space (Poincaré ball) to capture hierarchical structure, enabling strong generalization to unseen predicates and data shifts. Empirical results on CALVIN, BEHAVIOR, and real-world transfer show PHIER outperforms supervised and large pretrained VLM baselines in out-of-distribution and zero-/few-shot settings, while maintaining competitive in-distribution performance. This approach demonstrates that leveraging predicate hierarchies and hyperbolic geometry can substantially improve robust, data-efficient state reasoning for robotic planning and manipulation.
Abstract
State classification of objects and their relations is core to many long-horizon tasks, particularly in robot planning and manipulation. However, the combinatorial explosion of possible object-predicate combinations, coupled with the need to adapt to novel real-world environments, requires state classification models to generalize to novel queries from only a few examples. To this end, we propose PHIER, which leverages predicate hierarchies to generalize effectively in few-shot scenarios. PHIER combines an object-centric scene encoder, self-supervised losses that infer semantic relations between predicates, and a hyperbolic distance metric that captures hierarchical structure to learn a structured latent space of image-predicate pairs that guides reasoning over state classification queries. We evaluate PHIER in the CALVIN and BEHAVIOR robotic environments and show that it significantly outperforms existing methods in few-shot, out-of-distribution state classification and demonstrates strong zero- and few-shot generalization from simulated to real-world tasks. Our results demonstrate that leveraging predicate hierarchies improves performance on state classification tasks with limited data.
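
To make the hyperbolic distance metric concrete, the following is a minimal sketch of the standard geodesic distance on the unit Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is an illustration only and not PHIER's implementation: the function name `poincare_distance`, the fixed unit curvature, and the `eps` guard are assumptions introduced here.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Hypothetical helper for illustration; assumes curvature -1 and plain
    Python sequences of floats as inputs.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Guard against division by a vanishing denominator near the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


# The same Euclidean gap (0.1) costs more hyperbolic distance near the boundary.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.8, 0.0], [0.9, 0.0])
```

Because distances grow rapidly toward the boundary, the ball has room to place general concepts near the origin and many specific ones farther out, which is the usual motivation for embedding tree-like hierarchies in hyperbolic space.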
