ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment

Hao-Lun Hsu, Qitong Gao, Miroslav Pajic

TL;DR

This work reframes adaptive deep brain stimulation for Parkinson's disease as a contextual multi-armed bandit problem, using beta-band power $P_\beta$ as the context and discretized stimulation frequencies up to $180$ Hz as the arms. It introduces ε-NeuralTS, a neural-network-based Thompson sampling method with an ε-exploring strategy, to improve sample efficiency and reduce computation for real-time embedded DBS. The method is evaluated on a computational Basal Ganglia Model, where it achieves stronger suppression of beta-band power $P_\beta$ and a lower Error Index $EI$ than fixed-frequency DBS and various CMAB baselines, with energy usage comparable to periodic stimulation. The results suggest that context-aware, probabilistic exploration can yield clinically meaningful improvements and pave the way for practical, low-resource adaptive DBS systems. Future work includes mapping $P_\beta$ to $EI$ and validation on real patient data.

Abstract

Deep Brain Stimulation (DBS) stands as an effective intervention for alleviating the motor symptoms of Parkinson's disease (PD). Traditional commercial DBS devices are only able to deliver fixed-frequency periodic pulses to the basal ganglia (BG) regions of the brain, i.e., continuous DBS (cDBS). However, these devices generally suffer from energy inefficiency and side effects, such as speech impairment. Recent research has focused on adaptive DBS (aDBS) to resolve the limitations of cDBS. Specifically, reinforcement learning (RL) based approaches have been developed to adapt the frequencies of the stimuli in order to achieve both energy efficiency and treatment efficacy. However, RL approaches generally require a significant amount of training data and computational resources, making it intractable to integrate RL policies into real-time embedded systems as needed in aDBS. In contrast, contextual multi-armed bandits (CMAB) generally lead to better sample efficiency than RL. In this study, we propose a CMAB solution for aDBS. Specifically, we define the context as the signals capturing irregular neuronal firing activities in the BG regions (i.e., beta-band power spectral density), while each arm signifies the (discretized) pulse frequency of the stimulation. Moreover, an ε-exploring strategy is introduced on top of the classic Thompson sampling method, leading to an algorithm called ε-Neural Thompson sampling (ε-NeuralTS), such that the learned CMAB policy can better balance exploration and exploitation of the BG environment. The ε-NeuralTS algorithm is evaluated using a computational BG model that captures the neuronal activities in PD patients' brains. The results show that our method outperforms both existing cDBS methods and CMAB baselines.
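
To make the bandit formulation above concrete, the following is a minimal sketch, not the authors' implementation. It makes two loud assumptions. First, a per-arm Bayesian linear model stands in for the neural network that ε-NeuralTS uses. Second, "ε-exploring" is read as: with probability ε, act on a Thompson (posterior) sample; otherwise act greedily on the posterior mean. The class name `EpsilonLinearTS`, the toy reward values, and the two-dimensional context are all hypothetical.

```python
import numpy as np


class EpsilonLinearTS:
    """Per-arm Bayesian linear regression with epsilon-exploring Thompson sampling.

    Assumption: a linear-Gaussian model replaces the neural network of
    epsilon-NeuralTS so the sketch stays short and self-contained.
    """

    def __init__(self, n_arms, dim, epsilon=0.1, prior_precision=1.0,
                 noise_scale=1.0, seed=0):
        self.epsilon = epsilon
        self.noise_scale = noise_scale
        self.rng = np.random.default_rng(seed)
        # One precision matrix A_a and one response vector b_a per arm.
        self.A = np.stack([prior_precision * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def select_arm(self, context):
        x = np.asarray(context, dtype=float)
        # Assumed rule: Thompson sample with probability epsilon, else greedy.
        use_sample = self.rng.random() < self.epsilon
        scores = np.empty(len(self.A))
        for a in range(len(self.A)):
            cov = np.linalg.inv(self.A[a])
            cov = (cov + cov.T) / 2.0  # keep the covariance numerically symmetric
            mean = cov @ self.b[a]
            if use_sample:
                theta = self.rng.multivariate_normal(mean, self.noise_scale**2 * cov)
            else:
                theta = mean
            scores[a] = theta @ x
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        # Standard rank-one posterior update for the chosen arm only.
        x = np.asarray(context, dtype=float)
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


def run_toy_demo(steps=200):
    """Toy loop in a made-up environment where each arm has a fixed mean payoff."""
    rng = np.random.default_rng(1)
    agent = EpsilonLinearTS(n_arms=4, dim=2, epsilon=0.2)
    base_reward = np.array([0.1, 0.3, 0.9, 0.5])  # hypothetical per-arm payoffs
    for _ in range(steps):
        context = np.array([1.0, rng.random()])  # [bias, normalized beta power]
        arm = agent.select_arm(context)
        reward = base_reward[arm] + 0.05 * rng.normal()
        agent.update(arm, context, reward)
    return agent
```

In an aDBS setting each arm index would map to one discretized stimulation frequency and the context would be derived from measured beta-band power. The reward used in the paper is not given in this summary, so the toy payoffs here are placeholders only.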

Paper Structure (20 sections, 15 equations, 12 figures, 2 tables, 1 algorithm)

Figures (12)

  • Figure 1: Deep brain stimulation: the implantable pulse generator is placed in the patient’s chest; electrodes that can record local field potentials (LFPs) and deliver stimulation are positioned in the basal ganglia (BG) to stimulate the subthalamic nucleus or the internal segment of the globus pallidus (GPi).
  • Figure 2: An illustration of the computational brain model. The DBS stimulation is deployed to the subthalamic nucleus (STN), propagating to the other sub-regions. Error index (EI) is computed with the activations passing from sensorimotor cortex (SMC) to thalamus (TH).
  • Figure 3: Correlation between the two QoC metrics ($P_\beta$ and EI); Pearson's correlation coefficient is $0.866$.
  • Figure 4: Learning curve with different penalty coefficients using NeuralTS (lower EI is better).
  • Figure 5: Task Performance for $\epsilon$-NeuralTS with different $\epsilon$ averaged over $10$ seeds. Shaded areas denote the standard error: (a) task reward (higher is better), (b) cumulative regret (lower is better).
  • ...and 7 more figures

Theorems & Definitions (1)

  • Definition 1: Regret
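
This listing names Definition 1 without reproducing it. For orientation, the form commonly used for contextual bandits is sketched below. The notation is an assumption and may differ from the paper's: $a_t$ is the arm pulled at round $t$, $a_t^{*}$ is the arm with the highest expected reward for the context observed at round $t$, and $r_{t,a}$ is the reward of arm $a$ at round $t$.

$$
R_T = \mathbb{E}\left[\sum_{t=1}^{T} \left( r_{t,a_t^{*}} - r_{t,a_t} \right)\right]
$$

Under this reading, the cumulative regret in Figure 5(b) grows by the expected reward gap each time a suboptimal stimulation frequency is chosen, so a flatter curve indicates a better policy.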