Dynamic Task Control Method of a Flexible Manipulator Using a Deep Recurrent Neural Network

Kento Kawaharazuka; Toru Ogawa; Cota Nabeshima

TL;DR

Dynamic Task Execution Network (DTXNET) targets the control of flexible, underactuated manipulators. It learns state and task transitions from both actuator data and images, which enables real-time, torque-based control of dynamic tasks that provide only sparse cues. The method uses a deep recurrent (LSTM) model and backpropagation through time to optimize the control sequence over a horizon $T_{control}$, with a loss $L_{control}$ that emphasizes accurate task-state timing. Several input/output configurations are compared; the Type $3^+$ setup (Joint State + Image as input, with the robot state also output) performs best at predicting both motion and task signals. Experiments on a Wadaiko drumming task show better timing and movement fidelity than random control, which supports DTXNET's ability to handle dynamic tasks on flexible robots and suggests applicability to other soft-robot control problems. The approach is a data-driven alternative to precise modeling for dynamic, sparse-event manipulation in soft robotics, and could be extended to other sensors and tasks.

Abstract

The flexible body has advantages over the rigid body in terms of environmental contact thanks to its underactuation. On the other hand, when applying conventional control methods to realize dynamic tasks with the flexible body, there are two difficulties: accurate modeling of the flexible body and the derivation of intermediate postures to achieve the tasks. Learning-based methods are considered to be more effective than accurate modeling, but they require explicit intermediate postures. To solve these two difficulties at the same time, we developed a real-time task control method with a deep recurrent neural network named Dynamic Task Execution Network (DTXNET), which acquires the relationship among the control command, robot state including image information, and task state. Once the network is trained, only the target event and its timing are needed to realize a given task. To demonstrate the effectiveness of our method, we applied it to the task of Wadaiko (traditional Japanese drum) drumming as an example, and verified the best configuration of DTXNET.
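The core idea of optimizing a control sequence by backpropagating a task loss through a recurrent model can be illustrated with a toy example. The sketch below is not DTXNET: it replaces the LSTM with a hypothetical scalar recurrent model whose fixed weights `A`, `B`, `C` are assumptions, uses a plain squared error in place of $L_{control}$, and treats the list length as the control horizon. It only shows how gradients with respect to the control inputs flow backward through time when the target is a single sparse event.

```python
import math

# Hypothetical fixed weights standing in for a trained recurrent model.
A, B, C = 0.5, 1.0, 2.0


def rollout(u_seq, h0=0.0):
    """Run the toy model forward: h' = tanh(A*h + B*u), task output o = C*h'."""
    hs, outs = [h0], []
    for u in u_seq:
        h = math.tanh(A * hs[-1] + B * u)
        hs.append(h)
        outs.append(C * h)
    return hs, outs


def loss_and_grad(u_seq, target):
    """Squared task-state error and its gradient w.r.t. each control input."""
    hs, outs = rollout(u_seq)
    loss = sum((o - y) ** 2 for o, y in zip(outs, target))
    grad = [0.0] * len(u_seq)
    dh_future = 0.0  # gradient reaching this hidden state from later time steps
    for t in reversed(range(len(u_seq))):
        h = hs[t + 1]
        dh = 2.0 * (outs[t] - target[t]) * C + dh_future
        dpre = dh * (1.0 - h * h)  # derivative of tanh
        grad[t] = dpre * B
        dh_future = dpre * A
    return loss, grad


def optimize_controls(target, steps=2000, lr=0.02):
    """Gradient descent on the control sequence, starting from zero input."""
    u = [0.0] * len(target)
    for _ in range(steps):
        _, g = loss_and_grad(u, target)
        u = [ui - lr * gi for ui, gi in zip(u, g)]
    return u


if __name__ == "__main__":
    # Sparse target: one event (for example, a drum hit) at step 4 of 8.
    target = [0.0] * 8
    target[4] = 1.0
    u_opt = optimize_controls(target)
    print("loss before:", loss_and_grad([0.0] * 8, target)[0])
    print("loss after: ", loss_and_grad(u_opt, target)[0])
```

In this toy setting only the target event and its timing are specified; the intermediate inputs are found by the optimization, which mirrors the abstract's point that explicit intermediate postures are not required.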

Paper Structure (18 sections, 4 equations, 9 figures)

INTRODUCTION
DTXNET and Real-time Control System
Basic Structure of DTXNET
Various Configurations of DTXNET
Training Phase of DTXNET
Real-time Control Phase Using DTXNET
Implementation Details
Sound Processing
Image Processing
Detailed Network Implementation
Parameters of Training and Control Phase
Loss Calculation
Experiments
Experimental Setup
Training of DTXNET
...and 3 more sections

Figures (9)

Figure 1: Our goal: dynamic task control of the flexible body.
Figure 2: Basic structure of DTXNET and its application to a real-time control system.
Figure 3: Various configurations of DTXNET, distinguished by the design of the robot state (only Joint State: Type 1; only Image: Type 2; both: Type 3) and by the output of the network (Type $\cdot^-$ does not output the robot state; Type $\cdot^+$ outputs the robot state).
Figure 4: Experimental setup of Wadaiko drumming.
Figure 5: Loss transition in the training phase when using the Type $3^+$ model. $L_o$, $L_j$, $L_i$, and $L$ are the losses of task output, joint state, image, and their sum, respectively. $L^v_o$, $L^v_j$, $L^v_i$, and $L^v$ are the corresponding losses in the validation phase.
...and 4 more figures
