Hyperparameter Optimization for Driving Strategies Based on Reinforcement Learning

Nihal Acharya Adde, Hanno Gottschalk, Andreas Ebert

TL;DR

The paper addresses hyperparameter optimization for reinforcement-learning-based autonomous driving in a high-fidelity simulation. It combines Latin Hypercube Sampling (LHS) for initialization, Gaussian Process (GP) surrogates, and Efficient Global Optimization (EGO) with the Expected Improvement (EI) acquisition function, extended to parallel $q$EI, to maximize the cumulative reward of a PPO-based driving agent in a Unity3D simulator. Results show a gain of about 4% over both the manually tuned configuration and the best initial LHS configuration, with the surrogate model’s $R^2$ improving from $0.48$ to $0.69$ and the best hyperparameter set achieving a peak reward of $1193$. Sensitivity analysis identifies the learning rate as the most influential parameter. The work demonstrates the viability of GP-based Bayesian optimization for RL in autonomous driving and outlines future directions such as multi-objective optimization and co-development of architectures.
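
For readers unfamiliar with EI, the standard closed form for a maximization problem is given here for orientation. The notation is an assumption of this note and is not taken from the paper: $\mu(x)$ and $\sigma(x)$ are the GP posterior mean and standard deviation at a hyperparameter set $x$, $f^{*}$ is the best cumulative reward observed so far, and $\Phi$, $\varphi$ are the standard normal CDF and PDF.

$$
\mathrm{EI}(x) = \mathbb{E}\big[\max\big(f(x) - f^{*},\, 0\big)\big] = \big(\mu(x) - f^{*}\big)\,\Phi(z) + \sigma(x)\,\varphi(z), \qquad z = \frac{\mu(x) - f^{*}}{\sigma(x)}, \quad \sigma(x) > 0 .
$$

The first term rewards points whose predicted reward already exceeds $f^{*}$; the second rewards points where the surrogate is uncertain. $q$EI generalizes this to the expected improvement of a batch of $q$ points evaluated in parallel.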

Abstract

This paper focuses on hyperparameter optimization for autonomous driving strategies based on Reinforcement Learning. We provide a detailed description of training the RL agent in a simulation environment. Subsequently, we employ the Efficient Global Optimization algorithm, which uses Gaussian Process fitting, for hyperparameter optimization in RL. Before this optimization phase, Gaussian process interpolation is applied to fit the surrogate model, for which the hyperparameter set is generated using Latin hypercube sampling. To accelerate the evaluation, parallelization techniques are employed. The hyperparameter optimization procedure identifies a set of hyperparameters that noticeably improves overall driving performance: an increase of 4% compared with both the existing manually tuned parameters and the hyperparameters discovered during the initialization process using Latin hypercube sampling. After the optimization, we analyze the obtained results thoroughly and conduct a sensitivity analysis to assess the robustness and generalization capabilities of the learned autonomous driving strategies. The findings from this study contribute to the advancement of Gaussian-process-based Bayesian optimization of hyperparameters for autonomous driving in RL, providing valuable insights for the development of efficient and reliable autonomous driving systems.
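
The pipeline described above (LHS initialization, GP surrogate, EI-driven search) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_reward` is a hypothetical stand-in for training the PPO agent and returning its cumulative reward, `BOUNDS` and the two hyperparameters are invented for the example, the GP uses a fixed RBF kernel instead of fitted kernel parameters, EI is maximized by random candidate search, and points are added one at a time (sequential EI, not the parallel $q$EI used in the paper).

```python
import math
import numpy as np

# Hypothetical search space: (log10 learning rate, clipping parameter).
BOUNDS = [(-5.0, -3.0), (0.1, 0.3)]

def toy_reward(p):
    """Stand-in for 'train the RL agent, return cumulative reward' (assumption)."""
    lr_exp, clip = p
    return -((lr_exp + 4.0) ** 2 + ((clip - 0.2) / 0.1) ** 2)

def latin_hypercube(n_samples, bounds, rng):
    """One sample per stratum in every dimension, strata shuffled per dimension."""
    bounds = np.asarray(bounds, dtype=float)
    dim = len(bounds)
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    for j in range(dim):
        unit[:, j] = rng.permutation(unit[:, j])
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def rbf_kernel(a, b, length_scale):
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, length_scale=0.3, noise=1e-6):
    """GP posterior mean and std with a fixed RBF kernel on standardized targets."""
    mu_y, sd_y = y_train.mean(), y_train.std() + 1e-12
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train, length_scale)
    alpha = np.linalg.solve(k_tt, (y_train - mu_y) / sd_y)
    var = 1.0 - np.einsum("ij,ji->i", k_qt, np.linalg.solve(k_tt, k_qt.T))
    std = np.sqrt(np.clip(var, 1e-12, None))
    return mu_y + sd_y * (k_qt @ alpha), sd_y * std

def expected_improvement(mean, std, best):
    """Closed-form EI for maximization; clipped at 0 against round-off."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return np.maximum((mean - best) * cdf + std * pdf, 0.0)

def ego_maximize(objective, bounds, n_init=10, n_iter=15, n_candidates=2000, seed=0):
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lo, span = bounds[:, 0], bounds[:, 1] - bounds[:, 0]
    # Phase 1: space-filling initial design via LHS.
    x = latin_hypercube(n_init, bounds, rng)
    y = np.array([objective(p) for p in x])
    # Phase 2: refit the surrogate, then evaluate the candidate with the highest EI.
    for _ in range(n_iter):
        cand = lo + rng.random((n_candidates, len(bounds))) * span
        # The GP sees unit-cube coordinates so one length scale fits all dimensions.
        mean, std = gp_posterior((x - lo) / span, y, (cand - lo) / span)
        x_next = cand[np.argmax(expected_improvement(mean, std, y.max()))]
        x = np.vstack([x, x_next])
        y = np.append(y, objective(x_next))
    best = int(np.argmax(y))
    return x[best], y[best], x, y

if __name__ == "__main__":
    x_best, y_best, _, _ = ego_maximize(toy_reward, BOUNDS)
    print("best hyperparameters:", x_best, "reward:", y_best)
```

In a real setting each call to the objective is a full RL training run, which is why the number of evaluations is kept small and why evaluating several points in parallel, as the paper does, pays off.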

Paper Structure

This paper contains 25 sections, 5 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Framework of reinforcement learning
  • Figure 2: Unity3D simulator training environment. Top left: Drone view of road with agent. Top right: Camera recordings. Bottom: Driving behavior based on trajectory points.
  • Figure 3: PPO neural network architecture
  • Figure 4: Optimizing hyperparameters through iterative refinement: Latin Hypercube Sampling initiates the search, while Efficient Global Optimizer maximizes cumulative rewards.
  • Figure 5: Reward convergence plots. (a) Cumulative rewards achieved during the optimization process. In the EGO phase, rewards are refined iteratively by tuning hyperparameters based on knowledge gained from previously evaluated data. (b) Reward convergence for all RL iterations. Here, the average progression of rewards during RL training is plotted for both the initial data generation phase and the EGO phase.
  • ...and 1 more figures