
Hierarchical Event-Triggered Systems: Safe Learning of Quasi-Optimal Deadline Policies

Pio Ong, Manuel Mazo, Aaron D. Ames

TL;DR

This paper treats event-triggered systems as impulsive control systems whose objective is to minimize the number of impulses, and presents a hierarchical architecture that improves the efficiency of event-triggered control in reducing resource consumption.

Abstract

We present a hierarchical architecture to improve the efficiency of event-triggered control (ETC) in reducing resource consumption. This paper considers event-triggered systems generally as impulsive control systems in which the objective is to minimize the number of impulses. Our architecture recognizes that traditional ETC is a greedy strategy towards optimizing average inter-event times and introduces the idea of a deadline policy for the optimization of long-term discounted inter-event times. A lower layer is designed employing event-triggered control to guarantee the satisfaction of control objectives, while a higher layer implements a deadline policy designed with reinforcement learning to improve the discounted inter-event time. We apply this scheme to the control of an orbiting spacecraft, showing superior performance in terms of actuation frequency reduction with respect to a standard (one-layer) ETC while maintaining safety guarantees.
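The two-layer idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the toy scalar "distance to the boundary" model, and the constant `MAX_DEADLINE` are all hypothetical. The sketch assumes that the next event occurs at the earlier of the lower-layer ETC trigger and the higher-layer deadline, so the deadline can only make events happen sooner, never later.

```python
# Hypothetical sketch of a layered trigger: the event fires at the earlier
# of the lower-layer ETC trigger time and the higher-layer deadline.

MAX_DEADLINE = 100.0  # assumed upper bound on the deadline (illustrative)


def next_event_time(state, etc_time_to_trigger, deadline_policy):
    """Return the time until the next event from `state`.

    Taking the minimum means the lower-layer condition is never postponed,
    so its guarantees are unaffected by the choice of deadline policy.
    """
    return min(etc_time_to_trigger(state), deadline_policy(state))


def toy_etc_time_to_trigger(state):
    """Toy lower layer (assumption): the scalar state drifts at unit speed
    toward a boundary at 1.0, and the ETC condition fires on arrival."""
    return max(1.0 - state, 0.0)


def greedy_policy(state):
    """Greedy behaviour: always wait as long as the lower layer allows."""
    return MAX_DEADLINE


def early_policy(state):
    """A made-up deadline policy that triggers earlier than the ETC would."""
    return 0.25


if __name__ == "__main__":
    print(next_event_time(0.5, toy_etc_time_to_trigger, greedy_policy))
    print(next_event_time(0.5, toy_etc_time_to_trigger, early_policy))
```

With the greedy policy the deadline never binds and the scheme behaves like the one-layer ETC; any other policy can only shorten the current inter-event time, in the hope of lengthening later ones.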

Paper Structure

This paper contains 13 sections, 2 theorems, 25 equations, 4 figures, 1 algorithm.

Key Result

Proposition 1

(Equivalent Deadline Policy): Consider the triggering conditions $\Xi$ and $\Xi'$ such that $\Xi$ dominates $\Xi'$. Then, there exists a deadline policy $\mathcal{D}:\mathcal{X} \rightarrow \mathbb{R}_{\ge 0}$ such that the ETC scheme eq:trigger using $\Xi'$ and the scheme eq:trigger_deadline using

Figures (4)

  • Figure 1: Schematic description of the layered ETC architecture
  • Figure 2: Improvement of the discounted inter-event time (DIET) over generations of learning. For each generation, we plot the average DIET (red) over the 100 trajectories in that generation; the error bar (blue) spans the range of DIET values observed. The DIET axis uses a logarithmic scale so that the worst case is also visible.
  • Figure 3: Learned deadline policy. The plot shows the deadline value for each of the 400 state buckets. For radii close to the boundary of the safe set, the learned deadline is much lower than the 100 hours used by the traditional ETC (greedy) policy.
  • Figure 4: Ten trajectory comparisons between the greedy deadline policy (top) and the learned deadline policy (bottom). In both cases, the underlying ETC strictly enforces safety. However, the learned policy triggers so that trajectories dwell in the region where inter-event times are longer, leading to an overall increase in DIET.
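As a rough illustration of the quantity plotted in Figures 2 and 4, the sketch below computes a discounted sum of inter-event times. The exact definition of DIET is not given on this page, so the form $\sum_k \gamma^k \tau_k$, the function name, the discount factor, and the sample sequences are all assumptions made for illustration only.

```python
def discounted_inter_event_time(inter_event_times, gamma=0.9):
    """Discounted sum of inter-event times: sum over k of gamma**k * tau_k.

    Assumed form of DIET for illustration; `gamma` in (0, 1) weights
    earlier inter-event times more heavily than later ones.
    """
    return sum((gamma ** k) * tau for k, tau in enumerate(inter_event_times))


if __name__ == "__main__":
    # Made-up sequences: a long first interval followed by short ones,
    # versus slightly shorter but sustained intervals.
    greedy_like = [4.0, 0.5, 0.5]
    patient = [3.0, 3.0, 3.0]
    print(discounted_inter_event_time(greedy_like, gamma=0.5))  # 4.375
    print(discounted_inter_event_time(patient, gamma=0.5))      # 5.25
```

In this made-up example the sequence with the longest first interval has the lower discounted total, which is the kind of trade-off a non-greedy deadline policy is meant to exploit.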

Theorems & Definitions (9)

  • Definition 1
  • Definition 2
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Remark 1
  • Remark 2
  • Remark 3