Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives

Christian Gumbsch, Martin V. Butz, Georg Martius

TL;DR

A computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. After initial self-exploration, it can use its learned predictive models progressively more effectively for model predictive planning and goal-directed control in different tasks and environments.

Abstract

Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaptation of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experiences remains an open challenge. Here, we introduce a computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. The SUBMODES architecture bootstraps sensorimotor exploration using a self-organizing neural controller. While exploring the behavioral capabilities of its own body, the system learns modular structures that predict the sensorimotor dynamics and generate the associated behavior. In line with recent theories of event perception, the system uses unexpected prediction error signals, i.e., surprise, to detect transitions between successive behavioral primitives. We show that, when applied to two robotic systems with completely different body kinematics, the system manages to learn a variety of complex and realistic behavioral primitives. Moreover, after initial self-exploration, the system can use its learned predictive models progressively more effectively for invoking model predictive planning and goal-directed control in different tasks and environments.

Paper Structure

This paper contains 18 sections, 12 equations, 12 figures, 5 algorithms.

Figures (12)

  • Figure 1: Illustration of the SUBMODES architecture during the learning of behavior. An explorative controller generates motor commands based on the current proprioceptive input to explore self-organizing behavior. One of multiple, internal behavioral models attempts to predict the motor commands and sensory consequences of the ongoing behavior. The predicted sensorimotor state is compared to the actual state to compute the prediction error and update the active behavioral model. For each behavioral model an error model is trained, estimating the prediction confidence. If surprise is detected, i.e., a strong error signal outside the usual prediction confidence, the system is allowed to exchange the active behavioral model. For each transition between two different behavioral models a transition model is learned. During goal-directed control, the explorative controller is deactivated and the active behavioral model determines the next action (dashed line).
  • Figure 2: Spherical robot and its axis orientation sensors. (a) shows a screenshot from simulation. (b) shows a schematic illustration of how the axis orientation sensor values $x_i$ are determined (taken from martiusPhd).
  • Figure 3: Exemplary surprise detection for the Spherical robot and the Hexapod, shown through the development of the internal error statistics over time. The plots show the current prediction error ($e(t)$), the mean prediction error of the active model ($\bar{e}_i(t)$), and the confidence of the active model ($\bar{e}_i(t) + \theta \bar{\sigma}_i(t)$) over time. Marks along the x-axis denote 10-second intervals. The pictures show the surprise detection in simulation. The fourth frame depicts the time step when surprise was detected. The inter-frame interval is approximately 0.5 seconds. See the text for qualitative descriptions of the changes in behavior.
  • Figure 4: Behavioral space of the Spherical robot discovered by the SUBMODES architecture in one simulation. (a) illustrates the angular velocity $\omega_i$ around the internal axes. Each point in (b)-(d) shows the behavior of the robot in terms of angular velocities $\omega_i$ at that time. (b) shows the behavior for rolling in an approximately straight line, i.e., with changes in driving direction $|\dot{\alpha}| < 0.3^\circ$. (c) shows the behavior for turning left ($\dot{\alpha} > 0.3^\circ$) and (d) shows the behavior for turning right ($\dot{\alpha} < -0.3^\circ$). The color of each point depicts which behavioral model $B_i$ was active and predicting the behavior at this time. For clarity, only every 50th time step of the simulation is shown.
  • Figure 5: Exemplary gaits discovered by the SUBMODES system for the Hexapod. Each gait was encoded by a single behavioral model $B_i$. (a)-(d) show gaits in an open field. (e) and (f) show gaits in different terrains (see section \ref{sectionObstacle}). In (e) snow slows down leg movements within it. In (f) a low ceiling limits the upward movement range of the legs. The inter-frame interval for the shown images is approximately 0.2 seconds.
  • ...and 7 more figures
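
The surprise criterion described in the captions of Figures 1 and 3 can be illustrated with a small sketch: an error is treated as surprising when the current prediction error $e(t)$ exceeds the confidence bound $\bar{e}_i(t) + \theta \bar{\sigma}_i(t)$ of the active model. This is not the authors' implementation. The exponential moving averages, the smoothing rate, the value of $\theta$, the warm-up period, and all class and function names below are assumptions made for illustration only.

```python
import math


class ErrorStatistics:
    """Running error statistics of one behavioral model (hypothetical helper)."""

    def __init__(self, rate=0.05, theta=3.0):
        self.rate = rate    # assumed smoothing rate of the moving averages
        self.theta = theta  # assumed width of the confidence band
        self.mean = None    # running mean prediction error
        self.var = 0.0      # running variance of the prediction error

    def confidence(self):
        """Upper bound mean + theta * deviation; infinite before any update."""
        if self.mean is None:
            return math.inf
        return self.mean + self.theta * math.sqrt(self.var)

    def is_surprise(self, error):
        return error > self.confidence()

    def update(self, error):
        """Exponentially weighted update of the mean and the variance."""
        if self.mean is None:
            self.mean = error
            return
        delta = error - self.mean
        self.mean += self.rate * delta
        self.var = (1.0 - self.rate) * (self.var + self.rate * delta * delta)


def detect_surprise(errors, stats, warmup=20):
    """Return the time steps at which the error leaves the confidence band.

    Surprising errors are not used to update the statistics, so a single
    outlier does not widen the band. The warm-up length is an assumption.
    """
    surprising = []
    for t, error in enumerate(errors):
        if t >= warmup and stats.is_surprise(error):
            surprising.append(t)
        else:
            stats.update(error)
    return surprising


# Synthetic error trace: a steady baseline, one large error, baseline again.
errors = [0.10 if t % 2 == 0 else 0.12 for t in range(50)] + [0.5] + [0.10, 0.12] * 5
surprises = detect_surprise(errors, ErrorStatistics())
print(surprises)
```

In the architecture as described in Figure 1, a detected surprise would additionally allow the system to exchange the active behavioral model; the sketch stops at flagging the time step.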