Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes

Kento Kawaharazuka, Kei Okada, Masayuki Inaba

TL;DR

This work introduces Deep Predictive Model with Parametric Bias (DPMPB), a unified framework that embeds time-varying dynamics into a low-dimensional parametric bias to cope with modeling difficulties and temporal changes in robot control. DPMPB supports both state-transition (STM) and control-transition (CTM) forms and learns dynamics via joint optimization of network weights and state-specific PBs, with online PB updates that keep the model aligned to the current body, tools, and environment. Anomaly detection is performed through prediction-error statistics, enabling detection of unexpected changes such as new grasped objects or environment conditions. Across diverse experiments, including flexible hands, low-rigidity robots, floor changes, motion-style imitation, shoe changes, and cloth manipulation, DPMPB demonstrates robust adaptation and the ability to recognize unseen dynamics through PB space organization, highlighting its potential for real-world adaptive robotics.
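The online PB update mentioned above can be illustrated with a minimal sketch. It assumes a toy linear state-transition model in place of the actual neural network, and the weight values, learning rate, and function names (`predict`, `update_pb`, `make_batch`, `adapt`) are hypothetical choices made for illustration, not taken from the paper. The point it shows is only the mechanism: the trained weights stay frozen while the parametric bias alone is adjusted by gradient descent on the prediction error.

```python
# Toy stand-in for a trained state-transition model (STM). The real DPMPB is a
# neural network; a linear map is assumed here only to keep the sketch short.
W_S, W_U, W_P = 0.9, 0.2, 0.5  # frozen "network weights" (hypothetical values)


def predict(s, u, p):
    """Predict the next state from state s, command u, and parametric bias p."""
    return W_S * s + W_U * u + W_P * p


def update_pb(p, batch, lr=0.5):
    """One gradient step on the mean squared prediction error w.r.t. p only."""
    grad = 0.0
    for s, u, s_next in batch:
        err = predict(s, u, p) - s_next
        grad += 2.0 * err * W_P  # d(err^2)/dp; the weights receive no update
    return p - lr * grad / len(batch)


def make_batch(true_p, n=20):
    """Generate transitions from the same model under a hidden PB value."""
    batch = []
    s = 0.0
    for t in range(n):
        u = 1.0 if t % 2 == 0 else -1.0  # alternating command signal
        s_next = predict(s, u, true_p)
        batch.append((s, u, s_next))
        s = s_next
    return batch


def adapt(true_p, steps=100):
    """Start from p = 0 and repeatedly update p on data from the new dynamics."""
    p = 0.0
    batch = make_batch(true_p)
    for _ in range(steps):
        p = update_pb(p, batch)
    return p
```

In this toy setting, calling `adapt(0.7)` drives the estimated PB toward the hidden value 0.7 because the only mismatch between model and data lies in `p`. With a real network the gradient would come from backpropagation rather than a closed-form expression.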

Abstract

When a robot executes a task, it must model the relationship among its body, target objects, tools, and environment, and control its body to realize the target state. However, this relationship is difficult to model with classical methods when it is complex. In addition, when the relationship changes over time, the model must track those temporal changes. In this study, we have developed the Deep Predictive Model with Parametric Bias (DPMPB) as a more human-like adaptive intelligence that deals with these modeling difficulties and temporal model changes. We categorize and summarize the theory of DPMPB and various task experiments on actual robots, and discuss the effectiveness of DPMPB.

Paper Structure (22 sections, 8 equations, 10 figures)

Figures (10)

  • Figure 1: The developed deep predictive model with parametric bias (DPMPB) can handle various modeling difficulties and temporal model changes.
  • Figure 2: The classification of predictive models, modeling difficulties, and temporal model changes. The predictive models are classified by the network input and output of the values of sensors $\bm{s}$ and actuators $\bm{u}$. The modeling difficulties are classified by the relationship among robot behavior, body, object/tool, and environment. The temporal model changes are classified by the network structure of CTM or STM and by robot behavior, body, object/tool, and environment.
  • Figure 3: System overview of the deep predictive model with parametric bias (DPMPB). DPMPB takes the robot/object state $\bm{s}_{t}$, control command $\bm{u}_{t}$, and parametric bias $\bm{p}$ as network input, and outputs $\bm{s}_{t+1}$ and $\bm{u}_{t+1}$ depending on whether the network structure is a state transition model (STM) or a control transition model (CTM). The Controller, Anomaly Detector, Data Collector, Network Trainer, and Online PB Updater can be run on various robots through DPMPB by changing only the network input/output and a few parameters.
  • Figure 4: Detailed implementation of the deep predictive model with parametric bias (DPMPB). The model file stores the network weight $\bm{W}$, the trained parametric biases (PBs) $\bm{p}_{k}$ and their classes, the current PB $\bm{p}$, and the average $\mu$ and variance $\Sigma$ of the prediction error calculated at training. The configuration file specifies the input/output topics of the Robot Operating System (ROS), the dimensions of the PB and hidden units of DPMPB, and the model file name. Images are compressed by an AutoEncoder, and the Data Collector gathers all data and sends it to each component: Network Trainer, Online PB Updater, Controller, and Anomaly Detector. The Network Trainer updates $\bm{W}$ and $\bm{p}_{k}$, and the Online PB Updater updates $\bm{p}$.
  • Figure 5: Experiment on grasped-object recognition and contact control of the flexible hand [kawaharazuka2020dynamics]. (a) shows the sensors and actuators of the flexible musculoskeletal hand, (b) shows the grasped objects and tools, (c) shows the trained parametric bias and its trajectory during online update of PB, (d) shows the transition of $L$ with and without grasping stabilization control, and (e) shows the transition of $d$ for contact detection when grasping various objects.
  • ...and 5 more figures
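
The Figure 4 caption notes that the average $\mu$ and variance $\Sigma$ of the prediction error are computed at training time and used by the Anomaly Detector. One plausible way to use such statistics is sketched below, under stated assumptions: a diagonal covariance, a Mahalanobis-style distance as the anomaly score, and a fixed threshold. The source does not specify any of these, and the function names and threshold value are hypothetical.

```python
import math


def fit_error_stats(errors):
    """Per-dimension mean and variance of training prediction errors."""
    n = len(errors)
    dim = len(errors[0])
    mean = [sum(e[i] for e in errors) / n for i in range(dim)]
    var = [sum((e[i] - mean[i]) ** 2 for e in errors) / n for i in range(dim)]
    return mean, var


def mahalanobis(error, mean, var, eps=1e-9):
    """Distance of one prediction error from the training error distribution.

    Assumes a diagonal covariance; eps guards against zero variance.
    """
    return math.sqrt(
        sum((e - m) ** 2 / (v + eps) for e, m, v in zip(error, mean, var))
    )


def is_anomaly(error, mean, var, threshold=3.0):
    """Flag a prediction error whose distance exceeds the (assumed) threshold."""
    return mahalanobis(error, mean, var) > threshold
```

A typical use would be to call `fit_error_stats` once on the prediction errors collected during training, then evaluate `is_anomaly` on each new prediction error at run time. A sustained run of flagged errors would suggest that the body, tool, or environment has changed in a way the current PB does not capture.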