Scenario-based Thermal Management Parametrization Through Deep Reinforcement Learning

Thomas Rudolf, Philip Muhl, Sören Hohmann, Lutz Eckstein

TL;DR

This work addresses the challenge of robustly parametrizing embedded thermal management (TM) controllers for battery electric vehicles (BEVs) under diverse usage. It combines scenario-based virtual development with a contextual deep reinforcement learning (DRL) agent that treats electronic control unit (ECU) parameter maps as image-like inputs, enabling automatic tuning of a PI valve controller in a simulated TS-enriched environment. The approach uses a DroQ-based training pipeline, scenario-generated data, and an image-encoded parameter representation. In real-world tests at Nardò, the tuned valve controller performs competitively with expert and baseline parametrizations. The results indicate significant reductions in development time and the potential to transfer the methodology to other TM components and vehicle platforms, advancing virtual development in automotive engineering.

Abstract

The thermal system of battery electric vehicles demands advanced control. Its thermal management needs to effectively control active components across varying operating conditions. While robust control function parametrization is required, current methodologies show significant drawbacks: they demand considerable time, human effort, and extensive real-world testing. Consequently, there is a need for innovative and intelligent solutions that are capable of autonomously parametrizing embedded controllers. Addressing this issue, our paper introduces a learning-based tuning approach. We propose a methodology that benefits from automated scenario generation for increased robustness across vehicle usage scenarios. Our deep reinforcement learning agent processes the tuning task context and incorporates an image-based interpretation of embedded parameter sets. We demonstrate its applicability to a valve controller parametrization task and verify it in real-world vehicle testing. The results show performance that is competitive with baseline methods. This novel approach contributes to the shift towards virtual development of thermal management functions, with promising potential for large-scale parameter tuning in the automotive industry.
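To make the image-based interpretation of parameter sets more concrete, the following minimal Python sketch scales a two-dimensional parameter map into the range [0, 1], like a single-channel image, and applies an additive adaptation only at masked operating points. It is an illustration under stated assumptions, not the authors' implementation: the function names `to_image` and `apply_masked_action`, the 2x3 gain map, the parameter bounds, the additive update rule, and the mask are all hypothetical.

```python
from typing import List

Grid = List[List[float]]


def to_image(param_map: Grid, lo: float, hi: float) -> Grid:
    """Scale a 2-D parameter map into [0, 1], like a single-channel image."""
    span = hi - lo
    return [[(v - lo) / span for v in row] for row in param_map]


def apply_masked_action(param_map: Grid, action: Grid, mask: Grid,
                        lo: float, hi: float) -> Grid:
    """Add the action only where mask == 1, then clip to [lo, hi]."""
    return [
        [min(hi, max(lo, p + m * a)) for p, a, m in zip(p_row, a_row, m_row)]
        for p_row, a_row, m_row in zip(param_map, action, mask)
    ]


# Hypothetical 2x3 gain map over two axes of operating points.
kp_map = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
# Assumed mask: 1 marks operating points that may be adapted.
mask = [[1, 0, 0], [0, 1, 0]]
# Assumed additive adaptation proposed for every cell.
action = [[0.5, 0.5, 0.5], [-5.0, 2.0, 0.5]]

image = to_image(kp_map, lo=0.0, hi=4.0)
updated = apply_masked_action(kp_map, action, mask, lo=0.0, hi=4.0)
```

In this sketch, unmasked cells keep their values regardless of the proposed action, and masked cells are clipped to the assumed parameter bounds.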

Paper Structure

This paper contains 14 sections, 8 equations, 7 figures, 1 table, and 1 algorithm.

Figures (7)

  • Figure 1: The current thermal system development process typically requires intensive test iterations in climatic conditions around the world. Our novel approach streamlines the control function parametrization task to reduce testing iterations. We leverage diverse thermal system scenario generation to drive a learning parametrization agent.
  • Figure 2: The system schematic contains a continuous rotational four-way mixing valve, a coolant pump, a fan, and a radiator shutter as the thermal system actuators. The tractive system is the dominant heat source, while the heat exchanger represents the primary heat sink to ambient.
  • Figure 3: Closed-loop controller parametrization diagram, comprising the test-driven ECU parameter evaluation and tuning in a target application domain. A deep reinforcement learning loop interfaces with the environment via the observation $\bm{o}$ of the system, adaptation of the ECU parameters as actions $\bm{a}$, and reward feedback $r$ with respect to the testing objective $J$.
  • Figure 4: Overview of our proposed approach comprising the scenario-based generation of TSS (left), the simulation and evaluation of the TSS (center), and the architecture of the deep reinforcement learning agent (right). The observation $\bm{o}$ comprises the thermal management context information $\bm{c}$, the current controller parameters $\bm{\phi}$ as an image-like projection, and ECU signals. An encoder-decoder network processes the multi-modal information to predict an action $\bm{a}$ on the parameters as well as Q values for the learning algorithm. Controller parameter adaptations are masked with respect to operating points.
  • Figure 5: Agent training curve depicting the rolling mean (window size 15 episodes) of the episode rewards with a band of one standard deviation. The rewards are normalized to the best single-scenario performance experienced.
  • ...and 2 more figures
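
The kind of curve described in the Figure 5 caption can be computed as in the following sketch: a rolling mean of normalized episode rewards with a band of one standard deviation. This rests on several assumptions, because the caption does not specify them: a trailing window, the population standard deviation, a positive normalization reference `best`, and the hypothetical function name `rolling_band`.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def rolling_band(rewards: List[float], best: float,
                 window: int = 15) -> List[Tuple[float, float, float]]:
    """Return (mean - std, mean, mean + std) per episode over a trailing window.

    `best` is the normalization reference (assumed positive), e.g. the best
    single-scenario performance experienced during training.
    """
    normed = [r / best for r in rewards]
    band = []
    for i in range(len(normed)):
        # Trailing window: shorter than `window` during the first episodes.
        chunk = normed[max(0, i - window + 1): i + 1]
        m, s = mean(chunk), pstdev(chunk)
        band.append((m - s, m, m + s))
    return band


# Hypothetical episode rewards; a short window keeps the example readable.
band = rolling_band([1.0, 2.0, 4.0], best=4.0, window=2)
```

Each tuple holds the lower band, the rolling mean, and the upper band for one episode; the early episodes use whatever history is available.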