
Proximal Policy Optimization-Based Reinforcement Learning Approach for DC-DC Boost Converter Control: A Comparative Evaluation Against Traditional Control Techniques

Utsab Saha, Atik Jawad, Shakib Shahria, A. B. M Harun-Ur Rashid

TL;DR

This study tackles voltage regulation of a DC-DC boost converter under nonlinear dynamics by deploying a PPO-based reinforcement learning controller. It formulates the control problem within an actor-critic RL framework, using a clipped surrogate objective and a carefully designed reward structure to drive $V_{out}$ toward $V_{ref}$. Through MATLAB Simulink co-simulation, the PPO-based approach is compared against PI control (tuned by PSO/GA) and ANN control, showing superior stability and robustness, especially under dynamic input conditions. The findings indicate that RL-based control can enhance boost-converter performance in DC power systems, offering a promising direction for advanced intelligent control in power electronics.
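The clipped surrogate objective mentioned above is the standard PPO objective, $L^{CLIP}(\theta) = \mathbb{E}_t[\min(r_t(\theta) A_t, \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon) A_t)]$, where $r_t$ is the probability ratio between the new and old policies and $A_t$ is the advantage estimate. The following is a minimal sketch of that computation, not the authors' implementation: the function name, argument names, and the default $\epsilon = 0.2$ are illustrative assumptions, and the paper's actual hyperparameters are not given in this summary.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    updated and the data-collecting policy. clip_eps is an assumed value.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r_t = pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the clip stops the objective from rewarding ratio increases beyond $1+\epsilon$, which limits how far a single update can move the policy. With a negative advantage and a ratio above $1+\epsilon$, the unclipped term is kept, so harmful moves are still penalized in full.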

Abstract

This article proposes a proximal policy optimization (PPO)-based reinforcement learning (RL) approach for DC-DC boost converter control and compares it with traditional control methods. The performance of the PPO algorithm is evaluated using MATLAB Simulink co-simulation, and the results indicate that, among the approaches evaluated, the PPO-based RL controller is the most effective at achieving a short settling time and stability. The simulation results show that the PPO-based RL control method provides step response characteristics that outperform those of traditional control approaches, thereby enhancing DC-DC boost converter control. This research also highlights the inherent capability of the reinforcement learning method to enhance the performance of boost converter control.

Paper Structure (20 sections, 17 equations, 11 figures, 5 tables, 2 algorithms)


Figures (11)

  • Figure 1: Circuit diagram of DC-DC boost converter
  • Figure 2: Diagram of proposed method
  • Figure 3: Network architecture of actor-critic network
  • Figure 4: Output voltage of boost converter using proposed control method (Condition I)
  • Figure 5: Output voltage of boost converter using proposed control method (Condition II)
  • ...and 6 more figures