Approximate Dynamic Programming for Degradation-aware Market Participation of Battery Energy Storage Systems: Bridging Market and Degradation Timescales

Flemming Holtorf, Sungho Shin

Abstract

We present an approximate dynamic programming framework for designing degradation-aware market participation policies for battery energy storage systems. The approach employs a tailored value function approximation that reduces the state space to state of charge and battery health, while performing dynamic programming along a pseudo-time axis encoded by state of health. This formulation enables an offline/online computation split that separates long-term degradation dynamics (months to years) from short-term market dynamics (seconds to minutes) -- a timescale mismatch that renders conventional predictive control and dynamic programming approaches computationally intractable. The main computational effort occurs offline, where the value function is approximated via coarse-grained backward induction along the health dimension. Online decisions then reduce to a real-time tractable one-step predictive control problem guided by the precomputed value function. This decoupling allows the integration of high-fidelity physics-informed degradation models without sacrificing real-time feasibility. Backtests on historical market data show that the resulting policy outperforms several benchmark strategies with optimized hyperparameters.
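
The offline/online split described above can be illustrated on a toy problem. The sketch below is not the authors' algorithm or model. It assumes a hypothetical battery with a five-point state-of-charge grid, two equally likely price scenarios, and a made-up degradation law (a constant calendar loss plus a loss proportional to throughput). Because the health lost in one market step is much smaller than the spacing of the health grid, the value at the slightly degraded health is linearly interpolated between adjacent health levels, and each level is solved by plain value iteration, sweeping from end of life back to the fresh battery. All names and numbers (`offline_value_function`, `online_action`, `D_CAL`, `D_CYC`, the grids and prices) are illustrative assumptions.

```python
import numpy as np

# Toy model: every number and functional form below is an illustrative assumption.
SOC = np.linspace(0.0, 1.0, 5)        # state-of-charge grid (spacing 0.25)
HEALTH = np.linspace(1.0, 0.8, 5)     # coarse health grid, fresh -> end of life
STEP = np.array([-1, 0, 1])           # SoC index shift: discharge, idle, charge
ACTIONS = 0.25 * STEP                 # SoC change per market step
PRICES = np.array([20.0, 60.0])       # equally likely price scenarios
D_CAL, D_CYC = 0.005, 0.02            # calendar loss per step, cycling loss per unit SoC
DH = HEALTH[0] - HEALTH[1]            # health grid spacing (0.05)


def action_values(i, price, v_here, v_next):
    """Value of each action at SoC index i for an observed price.

    v_here / v_next are value functions over SoC at the current and the next
    lower health level. Infeasible actions get -inf.
    """
    q = np.full(len(ACTIONS), -np.inf)
    for a, u in enumerate(ACTIONS):
        j = i + STEP[a]
        if 0 <= j < len(SOC):
            # Fraction of one health grid cell consumed by this step.
            w = (D_CAL + D_CYC * abs(u)) / DH
            # Revenue (discharging, u < 0, earns money) plus value-to-go,
            # linearly interpolated between the two health levels.
            q[a] = -price * u + (1.0 - w) * v_here[j] + w * v_next[j]
    return q


def offline_value_function(n_sweeps=500):
    """Backward induction along the health axis (the expensive, offline part)."""
    V = np.zeros((len(HEALTH), len(SOC)))  # V[-1] = 0: no value at end of life
    for k in range(len(HEALTH) - 2, -1, -1):
        v = V[k + 1].copy()                # warm start from the next health level
        for _ in range(n_sweeps):          # value iteration within one health level
            v = np.array([
                np.mean([action_values(i, p, v, V[k + 1]).max() for p in PRICES])
                for i in range(len(SOC))
            ])
        V[k] = v
    return V


def online_action(i, k, price, V):
    """One-step decision (the cheap, online part) at SoC index i, health index k."""
    return ACTIONS[np.argmax(action_values(i, price, V[k], V[k + 1]))]
```

In this sketch, `offline_value_function()` would be run once ahead of time, and `online_action(i, k, price, V)` would be called at every market step with the current state-of-charge index, health index (below the end-of-life index), and observed price. The point of the design is that the online call only compares a handful of one-step values against a precomputed table, regardless of how long the battery lifetime is in market steps.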

Paper Structure

This paper contains 21 sections, 1 theorem, 25 equations, 4 figures, 1 algorithm.

Key Result

Proposition 1

Let $(x^*,u^*,\lambda^*)$ be a primal-dual feasible point of Problem \ref{eq:mpc} corresponding to the unique global optimum. Further, assume that $(x^*, u^*, \lambda^*)$ satisfies the strong second-order sufficient condition, the linear independence constraint qualification, and strict complementary slackness [bertse…]

Figures (4)

  • Figure 1: Nested discrete time axis for market and battery/grid dynamics.
  • Figure 2: Ground truth and representative scenarios for electricity and frequency regulation prices.
  • Figure 3: Uncertainty model for frequency regulation signal. Top: Spectrum and normalized unexplained variance of low-rank approximation to the empirical autocovariance matrix of the hourly frequency regulation signals. Bottom: Hourly frequency regulation signal (gray) and representative scenarios (black).
  • Figure 4: Cumulative returns under approximate value function-informed participation policy versus degradation penalty heuristic. The lines end where the battery reaches its end of life. Left to right: closed-loop simulations with ground truth uncertainty model, compared against Heuristic \ref{eq:heuristic_1}; closed-loop simulations, compared against Heuristic \ref{eq:heuristic_2}; backtests on historical market data, compared against Heuristic \ref{eq:heuristic_1}; backtests on historical market data, compared against Heuristic \ref{eq:heuristic_2}.

Theorems & Definitions (3)

  • Remark 1
  • Proposition 1
  • Proof