Baby Sophia: A Developmental Approach to Self-Exploration through Self-Touch and Hand Regard

Stelios Zarifis, Ioannis Chalkiadakis, Artemis Chardouveli, Vasiliki Moutzouri, Aggelos Sotirchos, Katerina Papadimitriou, Panagiotis Filntisis, Niki Efthymiou, Petros Maragos, Katerina Pastra

TL;DR

Baby Sophia demonstrates that purely intrinsic rewards can drive development-like, multimodal sensorimotor learning in an embodied agent. For self-touch, the approach combines a semantic tactile body map and a multi-component reward system with a two-stage curriculum. For hand regard, a visual discovery pipeline driven by motor babbling learns hand appearance and gaze tracking under a staged curriculum. Results show strong self-touch acquisition with extensive body coverage and balanced bilateral exploration. Hand regard, however, shows a lateral bias, with the agent fixating on one hand, which highlights the difficulty of bilateral coordination under purely curiosity-based signals. The study underscores the potential of self-supervised, curriculum-guided exploration for embodied AI and points to future work on improving symmetry and predictive visual models.

Abstract

Inspired by infant development, we propose a Reinforcement Learning (RL) framework for autonomous self-exploration in a robotic agent, Baby Sophia, using the BabyBench simulation environment. The agent learns self-touch and hand regard behaviors through intrinsic rewards that mimic an infant's curiosity-driven exploration of its own body. For self-touch, high-dimensional tactile inputs are transformed into compact, meaningful representations, enabling efficient learning. The agent then discovers new tactile contacts through intrinsic rewards and curriculum learning that encourage broad body coverage, balance, and generalization. For hand regard, visual features of the hands, such as skin color and shape, are learned through motor babbling. Intrinsic rewards then encourage the agent to perform novel hand motions and to follow its hands with its gaze. A curriculum that progresses from single-hand to dual-hand training allows the agent to reach complex visual-motor coordination. The results demonstrate that purely curiosity-based signals, with no external supervision, can drive coordinated multimodal learning, imitating an infant's progression from random motor babbling to purposeful behaviors.
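
To make the idea of an intrinsic reward for body coverage and balance more concrete, here is a minimal sketch in Python. It is not the reward used in the paper: the body-part labels, the count-based novelty term, the left/right balance term, and the `balance_weight` parameter are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical body-part labels; the paper's actual tactile body map is not shown here.
LEFT_PARTS = {"left_arm", "left_leg"}
RIGHT_PARTS = {"right_arm", "right_leg"}


class CoverageReward:
    """Illustrative count-based novelty reward with a left/right balance term."""

    def __init__(self, balance_weight=0.5):
        self.counts = Counter()
        self.balance_weight = balance_weight  # assumed weighting, not from the paper

    def __call__(self, touched_part):
        # Novelty term: decays as 1/sqrt(n) with the number of contacts on this part.
        self.counts[touched_part] += 1
        novelty = self.counts[touched_part] ** -0.5

        # Balance term: 1.0 when both sides have been touched equally often,
        # 0.0 when every lateral contact so far is on one side.
        left = sum(self.counts[p] for p in LEFT_PARTS)
        right = sum(self.counts[p] for p in RIGHT_PARTS)
        total = left + right
        balance = 1.0 - abs(left - right) / total if total else 0.0

        return novelty + self.balance_weight * balance
```

Under these assumptions, repeated contact with the same body part earns a shrinking reward, while a first contact on the opposite side earns both a fresh novelty bonus and a balance bonus. This is one simple way a curiosity signal could favor broad and bilateral exploration.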

Paper Structure

This paper contains 25 sections, 7 equations, 3 tables.