Accelerating Reinforcement Learning via Error-Related Human Brain Signals

Suzie Kim, Hye-Bin Shin, Hyo-Jeong Jang

TL;DR

This work accelerates reinforcement learning for high-dimensional robotic manipulation using error-related potentials (ErrPs) decoded from EEG. The method trains an ErrP decoder with a leave-one-subject-out scheme and maps the decoder output $p_t$ to a neural feedback term $r_{hf}(t)=0.5-p_t$, which is combined with the environment reward as $r_{total}(t)=r_{env}(t)+\alpha\,r_{hf}(t)$. Evaluations on a 7‑DoF robot in a cluttered robosuite Lift task show that small-to-moderate weights $\alpha$ consistently improve sample efficiency and trajectory quality, whereas very large weights can hinder final performance even though they may reduce collisions. Cross-subject experiments with twelve participants demonstrate robustness to inter-individual EEG variability, indicating a scalable pathway for human-aligned manipulation skill acquisition.
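The reward combination described above can be illustrated with a minimal sketch. It assumes that $p_t$ is the decoder's estimated probability, in $[0, 1]$, that the observer perceived an error at step $t$; the function names `neural_feedback` and `shaped_reward` are hypothetical and are not taken from the paper.

```python
def neural_feedback(p_error: float) -> float:
    """Neural feedback term r_hf(t) = 0.5 - p_t.

    Assumption: p_error is the ErrP decoder's error probability in [0, 1],
    so the feedback lies in [-0.5, 0.5] and is zero when the decoder is
    undecided (p_t = 0.5).
    """
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must lie in [0, 1]")
    return 0.5 - p_error


def shaped_reward(r_env: float, p_error: float, alpha: float) -> float:
    """Total reward r_total(t) = r_env(t) + alpha * r_hf(t)."""
    return r_env + alpha * neural_feedback(p_error)


# Illustrative values only (not results from the paper).
undecided = shaped_reward(r_env=0.0, p_error=0.5, alpha=0.3)    # no shaping
likely_error = shaped_reward(r_env=0.0, p_error=0.9, alpha=0.3)  # penalised
likely_ok = shaped_reward(r_env=1.0, p_error=0.0, alpha=0.2)     # bonus
```

Under this reading, a step the decoder flags as a likely error ($p_t > 0.5$) receives a small penalty, a step it judges correct receives a small bonus, and $\alpha$ scales both against the environment reward.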

Abstract

In this work, we investigate how implicit neural feedback can accelerate reinforcement learning in complex robotic manipulation settings. While prior electroencephalogram (EEG)-guided reinforcement learning studies have primarily focused on navigation or low-dimensional locomotion tasks, we aim to understand whether such neural evaluative signals can improve policy learning in high-dimensional manipulation tasks involving obstacles and precise end-effector control. We integrate error-related potentials decoded from offline-trained EEG classifiers into reward shaping and systematically evaluate the impact of human-feedback weighting. Experiments on a 7-DoF manipulator in an obstacle-rich reaching environment show that neural feedback accelerates reinforcement learning and, depending on the human-feedback weighting, can yield task success rates that at times exceed those of sparse-reward baselines. Moreover, when applying the best-performing feedback weighting across all subjects, we observe consistent acceleration of reinforcement learning relative to the sparse-reward setting. Furthermore, leave-one-subject-out evaluations confirm that the proposed framework remains robust despite the intrinsic inter-individual variability in EEG decodability. Our findings demonstrate that EEG-based reinforcement learning can scale beyond locomotion tasks and provide a viable pathway for human-aligned manipulation skill acquisition.
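The leave-one-subject-out evaluation mentioned in the abstract can be sketched as a simple fold generator. This is a generic illustration of the splitting scheme, not the authors' code; the subject labels `S01` to `S12` and the function name `leave_one_subject_out` are assumptions made for the example.

```python
from typing import Iterator, List, Tuple


def leave_one_subject_out(
    subject_ids: List[str],
) -> Iterator[Tuple[List[str], str]]:
    """Yield (training_subjects, held_out_subject) pairs.

    Each subject is held out exactly once; a decoder for that fold would be
    trained only on the remaining subjects' EEG data.
    """
    for held_out in subject_ids:
        training = [s for s in subject_ids if s != held_out]
        yield training, held_out


# Hypothetical labels for twelve participants.
subjects = [f"S{i:02d}" for i in range(1, 13)]
folds = list(leave_one_subject_out(subjects))

for training, held_out in folds[:2]:
    print(f"held out {held_out}, training on {len(training)} subjects")
```

Because the held-out subject's data never enters training, each fold measures how well a decoder transfers to a person it has not seen.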

Paper Structure

This paper contains 10 sections, 3 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Effect of human-feedback weighting on learning performance. Episodic return curves for sparse RL and RLIHF across different feedback weights ($\alpha$). Moderate weights ($\alpha = 0.1$ to $0.3$) consistently accelerate learning, while excessively large weights reduce final performance.
  • Figure 2: Cross-subject RLIHF evaluation across 12 participants. For nearly all subjects, RLIHF accelerates learning relative to the sparse-reward baseline, demonstrating robustness to individual variability in EEG decoder accuracy.