Predictability-Based Curiosity-Guided Action Symbol Discovery
Burcu Kilic, Alper Ahmetoglu, Emre Ugur
TL;DR
The paper tackles the autonomous discovery of high-level symbolic actions and perceptual symbols for planning in manipulation tasks. It introduces a predictive encoder-decoder that yields discrete object and action symbols through a binarized bottleneck, with an entropy-based curiosity signal steering exploration toward informative actions. A BFS planner operates on the learned primitives, and a parameter-distillation step converts the symbols back into executable continuous actions. Empirical results show that curiosity-driven exploration yields a richer set of action primitives and higher planning success than the baselines, a step toward autonomous skill discovery and generalizable planning in developmental robotics.
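As a rough illustration of how a BFS planner can operate on discrete primitives, the sketch below searches over binary symbolic states. It is not the authors' implementation: the function `bfs_plan`, the `toy_transition` model, the action names `a0` and `a1`, and the depth limit are all hypothetical stand-ins for the learned symbols and the learned effect predictor.

```python
from collections import deque

def bfs_plan(start, goal, actions, transition, max_depth=5):
    """Breadth-first search over symbolic states.

    `transition(state, action)` is a hypothetical symbolic forward model
    that returns the next state, or None if the action does not apply.
    Returns the shortest action sequence, or None if no plan is found.
    """
    if start == goal:
        return []
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if len(plan) >= max_depth:
            continue  # do not expand beyond the depth limit
        for action in actions:
            nxt = transition(state, action)
            if nxt is None or nxt in visited:
                continue
            new_plan = plan + [action]
            if nxt == goal:
                return new_plan
            visited.add(nxt)
            frontier.append((nxt, new_plan))
    return None

def toy_transition(state, action):
    # Assumed toy dynamics on a 2-bit state: "a0" sets bit 0,
    # "a1" sets bit 1 but only once bit 0 is already set.
    b0, b1 = state
    if action == "a0":
        return (1, b1)
    if action == "a1" and b0 == 1:
        return (b0, 1)
    return None

plan = bfs_plan((0, 0), (1, 1), ["a0", "a1"], toy_transition)
unreachable = bfs_plan((0, 0), (0, 1), ["a0", "a1"], toy_transition)
```

Because BFS expands states level by level, the first plan it returns is also a shortest one in number of primitives; in this toy setup a goal that no action sequence reaches yields `None`.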
Abstract
Discovering symbolic representations for skills is essential for abstract reasoning and efficient planning in robotics. Previous neuro-symbolic robotic studies have mostly focused on discovering perceptual symbolic categories given a pre-defined action repertoire, and on generating plans with the given action symbols. A truly developmental robotic system, on the other hand, should be able to discover all the abstractions required by the planning system with minimal human intervention. In this study, we propose a novel system that autonomously discovers symbolic action primitives along with perceptual symbols. Our system is based on an encoder-decoder structure that takes object and action information as input and predicts the generated effect. To efficiently explore the vast continuous action parameter space, we introduce a curiosity-based exploration module that selects the most informative actions, namely those that maximize the entropy of the predicted effect distribution. The discovered symbolic action primitives are then used to generate plans with a symbolic tree search strategy in single- and double-object manipulation tasks. Across several experiments, we compare our model with two baselines that use different exploration strategies. The results show that our approach learns a diverse set of symbolic action primitives that are effective for generating plans to achieve given manipulation goals.
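The entropy-based selection rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's module: it assumes the predicted effect distribution is a discrete probability vector, and `toy_predictor`, `select_most_informative`, and the scalar candidate actions are hypothetical placeholders for the learned decoder and the continuous action parameters.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_informative(candidates, predict_effect_probs):
    """Return the candidate whose predicted effect distribution has maximum entropy."""
    return max(candidates, key=lambda a: entropy(predict_effect_probs(a)))

def toy_predictor(action):
    # Assumed stand-in for a learned model: a scalar action in [0, 1]
    # is mapped to a two-outcome effect distribution.
    p = min(max(action, 0.0), 1.0)
    return [p, 1.0 - p]

candidates = [0.05, 0.3, 0.5, 0.9]
best = select_most_informative(candidates, toy_predictor)
```

In this toy setup the candidate 0.5 is chosen, since its predicted outcomes are equally likely and the model is therefore least certain about the resulting effect.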
