Evolving LLM-Derived Control Policies for Residential EV Charging and Vehicle-to-Grid Energy Optimization
Vishesh Purnananda, Benjamin John Wruck, Mingyu Guo
TL;DR
This paper tackles the interpretability barrier of data-driven vehicle-to-grid (V2G) control by using Large Language Models (LLMs) to synthesize explicit, auditable Python policies. These policies are evolved through a six-stage, simulation-driven loop in the EV2Gym-Residential environment, optimizing profit while respecting state-of-charge (SoC) and safety constraints. The authors compare four prompting strategies and show that a Hybrid approach yields 118% of a human baseline’s profit with concise, readable code, while a Runtime LLM agent achieves up to 190% but at substantially higher cost and latency. The work demonstrates that code-as-policies, grounded in high-fidelity simulation and regulation-aware guardrails, can deliver transparent, deployable residential V2G controllers that support grid reliability and consumer trust.
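To make "explicit, auditable Python policies" concrete, the following is a minimal sketch of what such a rule-based policy might look like. It is not a policy from the paper: the `Observation` fields, the `policy` signature, and every threshold are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass

# All constants are hypothetical illustration values, not taken from the paper.
MAX_POWER_KW = 7.0       # charger power limit
BATTERY_KWH = 60.0       # usable battery capacity
MIN_SOC = 0.2            # never discharge at or below this reserve
TARGET_SOC = 0.8         # SoC the user expects at departure
CHEAP_PRICE = 0.10       # charge when the price is at or below this ($/kWh)
EXPENSIVE_PRICE = 0.30   # discharge when the price is at or above this ($/kWh)


@dataclass
class Observation:
    soc: float                 # state of charge as a fraction in [0, 1]
    price: float               # current electricity price in $/kWh
    hours_to_departure: float  # time left before the vehicle unplugs


def policy(obs: Observation) -> float:
    """Return charger power in kW: positive charges, negative discharges."""
    # Comfort guardrail: if reaching the target needs all the remaining time,
    # charge at full power regardless of price.
    energy_needed = max(0.0, TARGET_SOC - obs.soc) * BATTERY_KWH
    if energy_needed > 0 and energy_needed / MAX_POWER_KW >= obs.hours_to_departure:
        return MAX_POWER_KW
    # Arbitrage rules: buy when cheap, sell when expensive.
    if obs.price <= CHEAP_PRICE and obs.soc < 1.0:
        return MAX_POWER_KW
    # Safety guardrail: only sell while SoC is above the reserve.
    if obs.price >= EXPENSIVE_PRICE and obs.soc > MIN_SOC:
        return -MAX_POWER_KW
    return 0.0  # otherwise stay idle
```

Because each branch is a single readable rule, a reviewer can trace exactly why the controller charged, discharged, or idled at a given step, which is the kind of inspection a neural policy does not allow.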
Abstract
This research presents a novel application of Evolutionary Computation to the domain of residential electric vehicle (EV) energy management. While reinforcement learning (RL) achieves high performance in vehicle-to-grid (V2G) optimization, it typically produces opaque "black-box" neural networks that are difficult for consumers and regulators to audit. To address this interpretability gap, we propose a program search framework that uses Large Language Models (LLMs) as intelligent mutation operators within an iterative prompt-evaluation-repair loop. With the high-fidelity EV2Gym simulation environment serving as the fitness function, the system runs successive refinement cycles to synthesize executable Python policies that balance profit maximization, user comfort, and physical safety constraints. We benchmark four prompting strategies (Imitation, Reasoning, Hybrid, and Runtime) and evaluate their ability to discover adaptive control logic. Results demonstrate that the Hybrid strategy produces concise, human-readable heuristics that achieve 118% of the baseline profit, discovering complex behaviors such as anticipatory arbitrage and hysteresis without being explicitly programmed to do so. This work establishes LLM-driven Evolutionary Computation as a practical approach for generating EV charging control policies that are transparent, inspectable, and suitable for real residential deployment.
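The prompt-evaluation-repair loop described above can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: `fitness` is a six-step stand-in for the EV2Gym simulator, `make_stub_mutator` replaces the LLM with a fixed list of candidate programs, and all names, prices, and SoC bounds are hypothetical.

```python
from typing import Callable, Optional, Tuple

# Toy price trace ($/kWh) standing in for one simulated episode (made-up values).
PRICES = [0.30, 0.10, 0.05, 0.10, 0.35, 0.40]
MIN_SOC, MAX_SOC, START_SOC, TARGET_SOC = 2, 10, 5, 5  # SoC in integer steps

SEED = "def policy(soc, price):\n    return 1 if soc < 8 else 0\n"


def fitness(source: str) -> Tuple[Optional[float], str]:
    """Run a candidate on the toy episode; return (profit, feedback)."""
    try:
        namespace: dict = {}
        exec(source, namespace)  # a real system would sandbox this step
        policy = namespace["policy"]
        soc, profit = START_SOC, 0.0
        for price in PRICES:
            action = max(-1, min(1, int(policy(soc, price))))  # clip to {-1, 0, 1}
            if not MIN_SOC <= soc + action <= MAX_SOC:         # enforce SoC bounds
                action = 0
            soc += action
            profit -= price * action  # pay to charge, earn to discharge
        if soc < TARGET_SOC:
            return None, "final SoC below departure target"
        return profit, "ok"
    except Exception as exc:  # crashes become feedback for the repair step
        return None, f"{type(exc).__name__}: {exc}"


def make_stub_mutator() -> Callable[[str, str], str]:
    """Hypothetical stand-in for the LLM: replays a fixed list of candidates."""
    candidates = [
        # Broken on purpose (undefined name) so the repair path is exercised.
        "def policy(soc, price):\n    return 1 if price < cheap else 0\n",
        "def policy(soc, price):\n    return 1 if price < 0.15 else 0\n",
        "def policy(soc, price):\n    if price < 0.15:\n        return 1\n"
        "    return -1 if price > 0.3 else 0\n",
    ]

    def mutate(parent: str, feedback: str) -> str:
        # A real operator would prompt an LLM with `parent` and `feedback`.
        return candidates.pop(0) if candidates else parent

    return mutate


def evolve(seed: str, mutate: Callable[[str, str], str],
           generations: int = 3, max_repairs: int = 1) -> Tuple[str, Optional[float]]:
    """Keep the best-scoring program; retry failed candidates with feedback."""
    best_src = seed
    best_fit, _ = fitness(seed)
    for _ in range(generations):
        candidate = mutate(best_src, "improve profit")
        score, feedback = fitness(candidate)
        repairs = 0
        while score is None and repairs < max_repairs:
            candidate = mutate(candidate, feedback)  # repair attempt
            score, feedback = fitness(candidate)
            repairs += 1
        if score is not None and (best_fit is None or score > best_fit):
            best_src, best_fit = candidate, score
    return best_src, best_fit


best_src, best_fit = evolve(SEED, make_stub_mutator())
print(best_src)
```

In this sketch the first candidate crashes, its error message is passed back to the mutation operator as repair feedback, and later candidates are accepted only when they beat the incumbent's simulated profit. The output of the search is plain source code that can be read and audited directly.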
