Optimal Policy Synthesis from A Sequence of Goal Sets with An Application to Electric Distribution System Restoration

İlker Işık, Onur Yigit Arpali, Ebru Aydin Gol

TL;DR

This paper addresses optimal policy synthesis for Markov Decision Processes (MDPs) driven by a sequence of priority-ordered goal sets. It introduces an iterative action-filtering approach: for each goal set, in priority order, it first keeps only the actions that achieve the maximal probability of reaching the goal set and then, among those, keeps only the actions that minimize the expected time to reach it. The final policy is the one that minimizes the infinite-horizon cost within the filtered action set. The method is applied to post-disaster distribution-system restoration, modeling bus statuses, energization actions, and constraints, and is reported to improve on prior approaches on both synthetic systems and a real 17-bus system. The work highlights the practical utility of goal-sequence planning for prioritized, time-sensitive restoration tasks and suggests avenues for extending the framework to additional criteria and goal-selection strategies.

Abstract

Motivated by the post-disaster distribution system restoration problem, we study the problem of synthesizing the optimal policy for a Markov Decision Process (MDP) from a sequence of goal sets. For each goal set, our aim is both to maximize the probability of reaching the goal set and to minimize the expected time to reach it. The order of the goal sets represents their priority. In particular, our aim is to generate a policy that is optimal with respect to the first goal set, optimal with respect to the second goal set among the policies that are optimal with respect to the first, and so on. To synthesize such a policy, we iteratively filter the applicable actions according to the goal sets. We illustrate the developed method over sample distribution systems and disaster scenarios.
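
The iterative filtering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary encoding `P[s][a]`, the function names, the fixed iteration count, the tolerance `TOL`, the unit time per step, and the toy MDP are all assumptions made for illustration. The sketch applies the expected-time filter only at states that reach the goal set with probability one, leaves actions at goal states untouched, and omits the final infinite-horizon cost minimization.

```python
from typing import Dict, List, Set

# Hypothetical encoding: P[s][a] maps successor states to probabilities.
MDP = Dict[str, Dict[str, Dict[str, float]]]
Actions = Dict[str, Set[str]]
TOL = 1e-9  # assumed numerical tolerance for "equally good" actions


def q_value(P: MDP, s: str, a: str, values: Dict[str, float]) -> float:
    """Expected value of `values` after taking action a in state s."""
    return sum(p * values[t] for t, p in P[s][a].items())


def filter_for_goal(P: MDP, allowed: Actions, goal: Set[str],
                    iters: int = 200) -> Actions:
    """Keep actions that maximize reach probability, then minimize time."""
    # Stage 1: maximal reachability probability by value iteration.
    V = {s: 1.0 if s in goal else 0.0 for s in P}
    for _ in range(iters):
        V = {s: 1.0 if s in goal
             else max(q_value(P, s, a, V) for a in allowed[s])
             for s in P}
    # Assumption: goal states keep all their currently allowed actions.
    kept = {s: set(acts) if s in goal
            else {a for a in acts if q_value(P, s, a, V) >= V[s] - TOL}
            for s, acts in allowed.items()}

    # Stage 2: minimal expected number of steps, computed only where the
    # goal is reached with probability one (elsewhere it is infinite).
    sure = [s for s in P if s not in goal and V[s] >= 1.0 - TOL]
    T = {s: 0.0 for s in P}
    for _ in range(iters):
        T_new = dict(T)
        for s in sure:
            T_new[s] = 1.0 + min(q_value(P, s, a, T) for a in kept[s])
        T = T_new
    for s in sure:
        kept[s] = {a for a in kept[s]
                   if 1.0 + q_value(P, s, a, T) <= T[s] + TOL}
    return kept


def synthesize_action_sets(P: MDP, goals: List[Set[str]]) -> Actions:
    """Filter actions goal by goal, in priority order."""
    allowed = {s: set(P[s]) for s in P}
    for goal in goals:
        allowed = filter_for_goal(P, allowed, goal)
    return allowed


# Toy MDP (hypothetical): g1 is the first-priority goal, g2 the second.
P: MDP = {
    "s0": {"safe_slow": {"m": 1.0},
           "safe_fast": {"g1": 1.0},
           "risky": {"g1": 0.5, "fail": 0.5}},
    "m": {"go": {"g1": 1.0}},
    "g1": {"left": {"g2": 0.9, "fail": 0.1},
           "right": {"g2": 0.6, "fail": 0.4}},
    "g2": {"stay": {"g2": 1.0}},
    "fail": {"stay": {"fail": 1.0}},
}

result = synthesize_action_sets(P, [{"g1"}, {"g2"}])
print(result)
```

In this toy model, the first goal set removes `risky` (lower reach probability) and then `safe_slow` (longer expected time) at `s0`, and the second goal set removes `right` at `g1`. Fixed-count value iteration is used here only for brevity; a practical version would check convergence and handle states where the goal is reached with probability strictly between zero and one more carefully.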

Paper Structure

This paper contains 11 sections, 16 equations, 2 figures, 12 tables, 1 algorithm.

Figures (2)

  • Figure 1: A distribution system and the probability of failure ($P_f$) values for each bus.
  • Figure 2: 17-bus distribution system and the probability of failure ($P_f$) values for each bus.