Primal-Only Actor Critic Algorithm for Robust Constrained Average Cost MDPs

Anirudh Satheesh; Sooraj Sathish; Swetha Ganesh; Keenan Powell; Vaneet Aggarwal

TL;DR

This work proposes an actor-critic algorithm for Average-Cost RCMDPs that achieves both \(\epsilon\)-feasibility and \(\epsilon\)-optimality, and establishes sample complexities of \(\tilde{O}\left(\epsilon^{-4}\right)\) with a slackness assumption and \(\tilde{O}\left(\epsilon^{-6}\right)\) without one, which are comparable to those in the discounted setting.

Abstract

In this work, we study the problem of finding robust and safe policies in Robust Constrained Average-Cost Markov Decision Processes (RCMDPs). A key challenge in this setting is the lack of strong duality, which prevents the direct use of standard primal-dual methods for constrained RL. Additional difficulties arise from the average-cost setting, where the robust Bellman operator is not a contraction under any norm. To address these challenges, we propose an actor-critic algorithm for Average-Cost RCMDPs. We show that our method achieves both \(\epsilon\)-feasibility and \(\epsilon\)-optimality, and we establish sample complexities of \(\tilde{O}\left(\epsilon^{-4}\right)\) with a slackness assumption and \(\tilde{O}\left(\epsilon^{-6}\right)\) without one, which are comparable to those in the discounted setting.
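
The non-contraction remark can be illustrated with a small sketch. It assumes a robust average-cost Bellman operator of the form \((Th)(s) = \sum_a \pi(a|s)\left[c(s,a) + \max_{P \in \mathcal{U}} \sum_{s'} P(s'|s,a)\,h(s')\right]\) over a finite uncertainty set \(\mathcal{U}\); this form, the two-state example, and every number below are assumptions made for illustration and are not taken from the paper. With no discount factor, adding a constant to \(h\) adds the same constant to \(Th\), so the distance between \(h\) and its shifted copy is preserved exactly under any norm and no contraction factor below one can exist.

```python
# Sketch: a constant shift passes unchanged through an (assumed) robust
# average-cost Bellman operator, so the operator cannot be a contraction.
# The operator form, the kernels, and all numbers are hypothetical.

def robust_bellman(h, cost, policy, kernels):
    """(T h)(s) = sum_a pi(a|s) * [c(s,a) + max_{P in U} sum_s' P(s'|s,a) h(s')]."""
    n_states = len(h)
    out = []
    for s in range(n_states):
        total = 0.0
        for a, pi_sa in enumerate(policy[s]):
            # Worst-case expected next value over the finite uncertainty set.
            worst = max(
                sum(P[s][a][s2] * h[s2] for s2 in range(n_states))
                for P in kernels
            )
            total += pi_sa * (cost[s][a] + worst)
        out.append(total)
    return out

# Two states, two actions; indexing is cost[s][a], policy[s][a], P[s][a][s'].
cost = [[1.0, 0.5], [0.0, 2.0]]
policy = [[0.5, 0.5], [0.2, 0.8]]
P1 = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.3, 0.7]]]
P2 = [[[0.7, 0.3], [0.4, 0.6]], [[0.6, 0.4], [0.1, 0.9]]]
kernels = [P1, P2]

h = [0.0, 1.0]
shift = 3.0
h_shifted = [x + shift for x in h]

Th = robust_bellman(h, cost, policy, kernels)
Th_shifted = robust_bellman(h_shifted, cost, policy, kernels)

# Each gap equals the shift (up to floating-point rounding): the distance
# between the two inputs did not shrink after applying the operator.
gaps = [b - a for a, b in zip(Th, Th_shifted)]
print(gaps)
```

In the discounted setting the same shift would come out scaled by the discount factor, which is what gives the contraction there.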
