Unsupervised, Bottom-up Category Discovery for Symbol Grounding with a Curious Robot
Catherine Henry, Casey Kennington
TL;DR
This work tackles the Symbol Grounding Problem by enabling a curious robot to autonomously build unlabeled categories from grounded sensorimotor experience in a bottom-up fashion. It integrates a curiosity-driven learning framework (Explauto) with robot perception pipelines (YOLO+CLIP and SAM+DINOv2) to carve the perceptual space into regions that can later be grounded to words using Words-as-Classifiers (WAC). The study demonstrates that high-dimensional visual representations, coupled with cosine-based region splitting, yield more precise category boundaries aligned with objects, and that symbol grounding via WAC is feasible for these emergent categories. The approach advances embodied language learning by reducing manual labeling and showing a pathway toward grounding discovered categories to symbolic labels in robotic systems, although segmentation noise and movement drift currently limit real-world deployment.
Abstract
Towards addressing the Symbol Grounding Problem and motivated by early childhood language development, we leverage a robot that has been equipped with an approximate model of curiosity, with particular focus on the bottom-up building of unsupervised categories grounded in the physical world. That is, rather than starting with a top-down symbol (e.g., a word referring to an object) and providing meaning through the application of predetermined samples, the robot autonomously and gradually breaks up its exploration space into a series of increasingly specific unlabeled categories, at which point an external expert may optionally provide a symbol association. We extend prior work by using a robot that can observe the visual world, introducing a higher-dimensional sensory space, and using a more generalizable method of category building. Our experiments show that the robot learns categories based on actions and what it visually observes, and that those categories can be grounded to symbols.
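To make the final step concrete, the following is a minimal sketch of how an expert-provided word might be attached to one discovered category in the Words-as-Classifiers style named in the TL;DR, where each word is a binary classifier over perceptual features. It is an illustration under stated assumptions, not the authors' implementation: the class name WordClassifier, the toy 3-dimensional vectors (standing in for CLIP or DINOv2 embeddings), the word "ball", and the training settings are all hypothetical.

```python
import math


class WordClassifier:
    """Hypothetical WAC-style word model: one logistic-regression classifier per word."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob(self, x):
        """Probability that the word applies to feature vector x."""
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        z = max(-60.0, min(60.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, positives, negatives, lr=0.5, epochs=200):
        """Train with plain stochastic gradient descent on the log loss."""
        data = [(x, 1.0) for x in positives] + [(x, 0.0) for x in negatives]
        for _ in range(epochs):
            for x, y in data:
                err = self.prob(x) - y
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err
        return self


# Toy stand-ins for two unlabeled categories the robot has discovered.
category_a = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [1.0, 0.0, 0.2]]
category_b = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.2, 1.0, 0.0]]

# Assumption: the expert names category_a "ball"; category_b serves as negatives.
ball = WordClassifier(dim=3).fit(category_a, category_b)

# A new observation near category_a should score higher than one near category_b.
print(ball.prob([0.95, 0.05, 0.05]), ball.prob([0.05, 0.95, 0.0]))
```

In this sketch the categories exist before any word does, and the label is added afterwards by training a classifier on a category's members, which mirrors the bottom-up order described in the abstract.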
