Dynamic Incentive Selection for Hierarchical Convex Model Predictive Control

Akshay Thirugnanam, Koushil Sreenath

TL;DR

This work addresses the design of incentives in hierarchical MPC where an upper-level BiMPC leads a population of lower-level LoMPCs by providing incentives that steer their optimal inputs. By framing this as an incentive Stackelberg game, the authors establish linear and linear-convex incentive structures that yield incentive controllability and allow the BiMPC to achieve team-optimal behavior without explicit knowledge of the LoMPC costs. They develop iterative, majorization-based algorithms to compute optimal incentives, prove convergence properties, and extend the approach to multiple LoMPCs with a robust BiMPC reformulation that guarantees bounded error relative to the team optimum. The dynamic price control example for EV charging demonstrates scalability to large EV populations and shows how ISO pricing can achieve valley-filling behavior while preserving feasibility. Overall, the framework provides scalable, privacy-preserving incentive design for hierarchical MPC with provable bounds and practical applicability to grid-scale dynamic pricing.

Abstract

In this paper, we discuss incentive design for hierarchical model predictive control (MPC) systems viewed as Stackelberg games. We consider a hierarchical MPC formulation where, given a lower-level convex MPC (LoMPC), the upper-level system solves a bilevel MPC (BiMPC) subject to the constraint that the lower-level system inputs are optimal for the LoMPC. Such hierarchical problems are challenging due to optimality constraints in the BiMPC formulation. We analyze an incentive Stackelberg game variation of the problem, where the BiMPC provides additional incentives for the LoMPC cost function, which grants the BiMPC influence over the LoMPC inputs. We show that for such problems, the BiMPC can be reformulated as a simpler optimization problem, and the optimal incentives can be iteratively computed without knowing the LoMPC cost function. We extend our formulation for the case of multiple LoMPCs and propose an algorithm that finds bounded suboptimal solutions for the BiMPC. We demonstrate our algorithm for a dynamic price control example, where an independent system operator (ISO) sets the electricity prices for electric vehicle (EV) charging with the goal of minimizing a social cost and satisfying electricity generation constraints. Notably, our method scales well to large EV population sizes.

Paper Structure

This paper contains 43 sections, 8 theorems, 85 equations, 7 figures, 3 tables, 2 algorithms.

Key Result

Proposition 1

(Hiriart-Urruty and Lemaréchal, 1993) If a proper function $f: \mathbb{R}^n \rightarrow \mathbb{R} \cup \{\infty\}$ is $m$-strongly convex, then $\operatorname{dom} f^* = \mathbb{R}^n$ and $f^*$ is $(1/m)$-smooth, where $f^*$ denotes the convex conjugate of $f$.
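As a quick numerical sanity check of Proposition 1, the sketch below considers the quadratic $f(x) = \tfrac{1}{2} x^\top Q x$ with $Q \succ 0$, which is $m$-strongly convex with $m = \lambda_{\min}(Q)$. Its conjugate is $f^*(y) = \tfrac{1}{2} y^\top Q^{-1} y$, so $\nabla f^*(y) = Q^{-1} y$ should be Lipschitz with constant $1/m$. The matrix `Q` and the helper names are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def conjugate_gradient(Q, y):
    """Gradient of f*(y) = 0.5 * y^T Q^{-1} y, the conjugate of f(x) = 0.5 * x^T Q x."""
    return np.linalg.solve(Q, y)


def check_smoothness(Q, num_pairs=1000, seed=0):
    """Return (m, worst observed Lipschitz ratio of grad f*) over random point pairs."""
    rng = np.random.default_rng(seed)
    # Strong-convexity modulus of f is the smallest eigenvalue of Q.
    m = float(np.linalg.eigvalsh(Q).min())
    worst = 0.0
    for _ in range(num_pairs):
        y1, y2 = rng.normal(size=(2, Q.shape[0]))
        num = np.linalg.norm(conjugate_gradient(Q, y1) - conjugate_gradient(Q, y2))
        den = np.linalg.norm(y1 - y2)
        worst = max(worst, float(num / den))
    return m, worst


# Illustrative (assumed) positive definite matrix.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
m, worst_ratio = check_smoothness(Q)
print(f"m = {m:.4f}, 1/m = {1.0 / m:.4f}, worst observed ratio = {worst_ratio:.4f}")
```

The observed ratio never exceeds $1/m$, which is consistent with $f^*$ being $(1/m)$-smooth in this special case. A check on one quadratic is only an illustration, not a proof of the general statement.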

Figures (7)

  • Figure 1: A flowchart depicting the incentive hierarchical MPC formulation. Given the current state $\xi(t) = (x(t), \bm{y}(t))$, the bilevel controller (the BiMPC) provides upper-level input $u$ and an incentive to the lower-level systems, which compute the lower-level input $\bm{w}$ using the LoMPCs. We assume that the BiMPC can query the lower-level system inputs for different incentives. The controller output $(u, \bm{w})$ is then used to update the upper-level system state. The BiMPC and the LoMPCs together comprise the hierarchical MPC problem. In the EV charging example considered in Secs. \ref{subsec:motivating-example} and \ref{sec:numerical-examples}, the BiMPC is solved by an ISO to determine the amount of electricity generation, while the LoMPCs are solved by EVs to determine charging rates. The incentive provided by the ISO is the unit price of electricity.
  • Figure 2: A framework for solving the hierarchical MPC problem given by \ref{eq:multiple-lompcs-incentivized-bimpc} and \ref{eq:linear-convex-incentivized-lompc} (this figure corresponds to the controller subsystem in Fig. \ref{fig:hierarchical-mpc-formulation}). At each time step, given the current state $\xi(t)$, we first solve the robust BiMPC \ref{eq:robust-bimpc} to get the team-optimal solution $(u^*, \hat{w}^*)$. By the bounded incentive controllability property, Lem. \ref{lem:bounded-incentive-controllability}, we can find an incentive $\lambda^*$ such that the average LoMPC solution $w^*$ has bounded error compared to $\hat{w}^*$. The incentive solver uses the iterative method in Thm. \ref{thm:linear-convex-incentives-iterative-method} to compute $\lambda^*$. The output $(u^*, \bm{w}^*)$ is guaranteed to be feasible for the incentive BiMPC \ref{eq:multiple-lompcs-incentivized-bimpc}. In the flowchart, the solid lines indicate information flows that happen multiple times per time step, while the dashed lines indicate those that happen once.
  • Figure 3: Verification of the LoMPC solution error bound in \ref{eq:ev-w-error-bound} for large EVs with $M = 20$. For a given $\Delta y^l_0$, we randomly generate an incentive $\lambda^l$, compute the team-optimal solution $\hat{w}^l$ as the optimal solution of the EV LoMPC with $\theta^i = 0$ (see \ref{eq:bounded-incentive-controllability-w-hat}), and compute $w^l$ using \ref{eq:ev-w-error-bound}. The plot shows that for any desired average electricity consumption $\hat{w}^l$ (that is feasible), we can set the unit price of electricity $\lambda^l$ such that the actual average electricity consumption $w^l$ has a bounded error compared to $\hat{w}^l$. The error bound is given by $\sqrt{N} \Delta y^l_0$, where $\Delta y^l_0$ is the range of normalized SoC of the large EV population.
  • Figure 4: Verification of the incentive solver in Alg. \ref{alg:iterative-method}. The incentive solver uses the majorization-minimization method to compute an optimal incentive $\lambda^*$ iteratively. For each iterate $\lambda^{(k)}$, the dual cost $-\tilde{g}^*$ is majorized by $-\tilde{g}^*(\cdot \,; \lambda^{(k)})$, which is then minimized (see Thm. \ref{thm:linear-convex-incentives-iterative-method}). The solid line shows the actual decrease in the dual cost function $-\tilde{g}^*$ at each iteration, which is guaranteed to be greater than the decrease in $-\tilde{g}^*(\cdot \,; \lambda^{(k)})$ (the dashed line). The plot is shown for large EVs with $M = 200$.
  • Figure 5: The total electricity consumption by the EVs over $48$ hrs, normalized by $B$ (see \ref{eq:ev-lompc-b}). The solid line shows the actual electricity consumption $w$, while the dotted line shows the predicted electricity consumption $\hat{w}$ (the team-optimal solution from the BiMPC). The colored region shows the error bound; the actual consumption $w$ is guaranteed to be within the error bound range around the predicted consumption $\hat{w}$. The actual consumption reduces during peak external demand and rises when external demand is low.
  • ...and 2 more figures
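The majorization-minimization idea described in the Figure 4 caption can be illustrated with a simplified sketch. This is not the paper's Alg. for the incentive solver; it assumes a single lower-level system with an unconstrained, strongly convex quadratic cost $g(w) = \tfrac{1}{2} w^\top Q w + c^\top w$ and a linear incentive entering as $-\lambda^\top w$. Under these assumptions the response is $w^*(\lambda) = \nabla g^*(\lambda)$, and the convex function $\phi(\lambda) = g^*(\lambda) - \lambda^\top \hat{w}$ has gradient $w^*(\lambda) - \hat{w}$ and is $(1/m)$-smooth by Proposition 1. Minimizing the quadratic majorizer of $\phi$ at $\lambda^{(k)}$ gives the update $\lambda^{(k+1)} = \lambda^{(k)} - m\,(w^*(\lambda^{(k)}) - \hat{w})$, which uses only queries of the lower-level response and a strong-convexity modulus $m$. The names `make_lompc_oracle`, `solve_incentive`, and all numerical values are hypothetical.

```python
import numpy as np


def make_lompc_oracle(Q, c):
    """Hypothetical lower-level response: argmin_w 0.5 w^T Q w + c^T w - lam^T w."""
    def query(lam):
        # Stationarity condition of the assumed quadratic: Q w + c - lam = 0.
        return np.linalg.solve(Q, lam - c)
    return query


def solve_incentive(query, w_hat, m, lam0, tol=1e-10, max_iters=500):
    """Majorization-minimization on phi(lam) = g*(lam) - lam^T w_hat.

    Only the response oracle `query` and the modulus `m` are used; the
    lower-level cost itself is never accessed inside this function.
    """
    lam = np.array(lam0, dtype=float)
    residuals = []
    for _ in range(max_iters):
        # Gradient of phi at lam: mismatch between response and target.
        residual = query(lam) - w_hat
        residuals.append(float(np.linalg.norm(residual)))
        if residuals[-1] < tol:
            break
        # Minimizer of the (1/m)-smooth quadratic majorizer at lam.
        lam = lam - m * residual
    return lam, residuals


# Illustrative (assumed) problem data.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
c = np.array([0.3, -0.2])
query_lompc = make_lompc_oracle(Q, c)
m = float(np.linalg.eigvalsh(Q).min())  # strong-convexity modulus of g
w_hat = np.array([0.4, 0.1])            # desired lower-level input

lam_star, residuals = solve_incentive(query_lompc, w_hat, m, lam0=np.zeros(2))
print(f"iterations: {len(residuals)}, final mismatch: {residuals[-1]:.2e}")
```

In this toy setting the mismatch $\|w^*(\lambda^{(k)}) - \hat{w}\|$ shrinks geometrically, and the computed incentive agrees with the closed form $\lambda^* = Q\hat{w} + c$. The paper's setting (constrained LoMPCs, linear-convex incentives, multiple lower-level systems) is more general than this sketch.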

Theorems & Definitions (28)

  • Proposition 1
  • Definition 1
  • Definition 2
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Definition 3
  • Remark 1
  • Lemma 1
  • ...and 18 more