Neural Control and Learning of Simulated Hand Movements With an EMG-Based Closed-Loop Interface

Balint K. Hodossy; Dario Farina

TL;DR

This study presents an in silico neuromechanical model that combines a fully forward musculoskeletal simulation, reinforcement learning, and sequential, online electromyography synthesis. The model provides synchronised kinematics, dynamics, and corresponding neural activity, and it explicitly models feedback and feedforward control in a virtual participant.

Abstract

The standard engineering approach when facing uncertainty is modelling. Mixing data from a well-calibrated model with real recordings has led to breakthroughs in many applications of AI, from computer vision to autonomous driving. This type of model-based data augmentation is now beginning to show promising results in biosignal processing as well. However, while these simulated data are necessary, they are not sufficient for virtual neurophysiological experiments. Simply generating neural signals that reproduce a predetermined motor behaviour does not capture the flexibility, variability, and causal structure required to probe neural mechanisms during control tasks. In this study, we present an in silico neuromechanical model that combines a fully forward musculoskeletal simulation, reinforcement learning, and sequential, online electromyography synthesis. This framework provides not only synchronised kinematics, dynamics, and corresponding neural activity, but also explicitly models feedback and feedforward control in a virtual participant. In this way, online control problems can be represented, as the simulated human adapts its behaviour via a learned RL policy in response to a neural interface. For example, the virtual user can learn hand movements robust to perturbations or the control of a virtual gesture decoder. We illustrate the approach using a gesturing task within a biomechanical hand model, and lay the groundwork for using this technique to evaluate neural controllers, augment training datasets, and generate synthetic data for neurological conditions.
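The online electromyography synthesis step can be illustrated with a toy example. The sketch below follows the pipeline outlined in the Figure 1 caption (muscle excitation, a pool of units with randomised properties, spike trains, convolution with a static set of waveforms), but it is not the authors' implementation. The function name `synth_emg`, the recruitment thresholds, the firing-rate range, the Bernoulli spiking model, and the damped-sine waveform are all assumptions made for illustration.

```python
import math
import random


def synth_emg(excitation, n_units=5, fs=1000, seed=0):
    """Toy single-channel EMG from an excitation sequence in [0, 1].

    Hypothetical model: each unit fires with a probability proportional to
    the excitation above its recruitment threshold, and every spike adds a
    fixed waveform to the output (a causal, sample-by-sample convolution).
    """
    rng = random.Random(seed)
    n = len(excitation)
    klen = int(0.010 * fs)  # assumed 10 ms waveform length

    units = []
    for _ in range(n_units):
        amp = rng.uniform(0.5, 1.5)
        units.append({
            # Randomised unit properties (assumed ranges).
            "threshold": rng.uniform(0.0, 0.6),
            "peak_rate_hz": rng.uniform(15.0, 35.0),
            # Static waveform per unit: a damped sine, chosen for simplicity.
            "kernel": [amp * math.sin(2 * math.pi * k / klen)
                       * math.exp(-3.0 * k / klen) for k in range(klen)],
        })

    emg = [0.0] * (n + klen - 1)
    spike_trains = []
    for unit in units:
        spikes = []
        for t, e in enumerate(excitation):
            drive = max(0.0, e - unit["threshold"])
            p_fire = unit["peak_rate_hz"] * drive / fs  # per-sample probability
            fired = rng.random() < p_fire
            spikes.append(1 if fired else 0)
            if fired:
                # Add the unit's waveform starting at the spike time.
                for k, w in enumerate(unit["kernel"]):
                    emg[t + k] += w
        spike_trains.append(spikes)
    return emg[:n], spike_trains


if __name__ == "__main__":
    signal, trains = synth_emg([1.0] * 2000)
    print(len(signal), sum(sum(s) for s in trains))
```

Because each spike only adds to current and future samples, the signal can be generated step by step alongside a simulation, which is the property that matters for closed-loop use.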

Paper Structure (25 sections, 7 equations, 7 figures, 2 tables)

This paper contains 25 sections, 7 equations, 7 figures, 2 tables.

Introduction
Background
Electromyography
Musculoskeletal Simulation
Reinforcement Learning
Contributions
Methods
MJX hand environment
Reinforcement Learning Policy
First phase: Gesturing task
Second phase: EMG-based control task
Results
Simulation performance
Learning Gesturing
EMG decoder and adaptation
...and 10 more sections

Figures (7)

Figure 1: Virtual user in a closed-loop HMI control task. The online control policy regresses observations to muscle excitation, which is then associated with a pool of motor units with randomised properties. The generated spike trains are convolved with a static set of corresponding motor unit action potentials to generate the synthetic EMG. We show a 20-s interval, during which the agent cycles through the randomised gesturing pattern 4 times, each finger being flexed and relaxed at a different frequency. For clarity, we visualise only one finger’s flexion and extension patterns, indicated by red shaded areas. We highlight the activity of the corresponding compartment of the FDP muscle, the envelope of which is intuitively correlated with the motion.
Figure 2: Performance of parallelisation with different numbers of environments using two backends: the generic backend (JAX) and Warp, the NVIDIA hardware-specific spatial computing toolkit. These tests were performed with 1024-step rollouts for all environments on a single NVIDIA L40S system with an Intel i7 processor. (a) Speedup gained with parallelisation using the JAX and Warp frameworks, relative to the un-parallelised environments built with the corresponding backend. The deviation from the ideal speedup is partially due to Amdahl's law and partially due to memory and processing bottlenecks. (b) Ratio of the time required to perform the same rollout with JAX or Warp. The benefit of parallelising the simulation with Warp grows with the environment count.
Figure 3: In the first phase, a control policy is learnt to follow arbitrary gesture patterns. The synthetic signals from this agent are used to train a virtual subject-specific decoder offline using supervised learning. In the second phase, the agent no longer pursues an imitation learning goal and instead adapts its motor behaviour to improve performance with the fixed-parameter decoder.
Figure 4: Illustration of the experience collection with parallel environments. Each environment uses a different projection of the reference motion templates shown in Figure 5, with different motion frequencies assigned to each finger.
Figure 5: The reference motion for the motion-tracking gesturing task. The templates are shared among all environments; each environment assigns each template to the flexion joints of a randomly sampled finger. The template is then scaled and offset as needed according to each joint's range of motion. The purely sinusoidal templates are shown in blue, and the discretised, then low-pass filtered variants in orange. The frequencies are 0.1, 0.2, 0.3, 0.4, and 0.5 Hz.
...and 2 more figures
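
The reference motion described in the Figure 4 and Figure 5 captions can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names, the raised-cosine form of the sinusoid, the 0.5 threshold used for discretisation, the first-order low-pass filter and its `alpha`, and the sampling rate are all hypothetical. Only the five frequencies and the per-environment random assignment of templates to fingers come from the captions.

```python
import math
import random

FREQS_HZ = (0.1, 0.2, 0.3, 0.4, 0.5)  # template frequencies from the caption


def sinusoid_template(freq_hz, duration_s, fs):
    """Normalised flexion target in [0, 1], starting from the relaxed pose."""
    n = int(duration_s * fs)
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * freq_hz * t / fs))
            for t in range(n)]


def discretised_lowpass(template, alpha=0.05):
    """Threshold the template to 0 or 1, then smooth it.

    The smoothing is a first-order exponential filter, an assumed stand-in
    for whatever low-pass filter the original templates use.
    """
    out, y = [], 0.0
    for v in template:
        target = 1.0 if v >= 0.5 else 0.0
        y += alpha * (target - y)
        out.append(y)
    return out


def scale_to_joint(template, lower, upper):
    """Scale and offset a normalised template to a joint's range of motion."""
    return [lower + (upper - lower) * v for v in template]


def assign_templates(seed):
    """Give each of five fingers a distinct frequency, shuffled per environment."""
    rng = random.Random(seed)
    freqs = list(FREQS_HZ)
    rng.shuffle(freqs)
    return freqs


if __name__ == "__main__":
    fs = 50  # assumed control rate in Hz
    for env_seed in range(3):
        print(env_seed, assign_templates(env_seed))
    smooth = discretised_lowpass(sinusoid_template(0.5, 20.0, fs))
    joint_target = scale_to_joint(smooth, 0.0, 1.5)  # example range in radians
    print(len(joint_target))
```

Seeding the assignment per environment gives each parallel environment a different mapping of the same shared templates, which mirrors the projection idea in the Figure 4 caption.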
