Robust Permissive Controller Synthesis for Interval MDPs
Khang Vo Huynh, David Parker, Lu Feng
TL;DR
This work addresses robust permissive controller synthesis for robots under uncertain dynamics modeled as Interval MDPs, where transition probabilities vary within intervals. It formulates the synthesis problem as mixed-integer linear programs and introduces two encodings: a vertex enumeration encoding that explicitly handles extreme points of the uncertainty polytopes, and a dualization-based encoding that avoids enumeration by solving the worst-case transition distribution via robust optimization. Both encodings guarantee that every compliant multi-strategy satisfies reachability or reward-based specifications under all admissible transitions and aim to maximize permissiveness. Empirical results on four robotic benchmark domains demonstrate that the methods yield robust, maximally permissive controllers and scale to IMDPs with hundreds of thousands of states, with the dualization approach offering superior scalability on large models.
Abstract
We address the problem of robust permissive controller synthesis for robots operating under uncertain dynamics, modeled as Interval Markov Decision Processes (IMDPs). IMDPs generalize standard MDPs by allowing transition probabilities to vary within intervals, capturing epistemic uncertainty from sensing noise, actuation imprecision, and coarse system abstractions-common in robotics. Traditional controller synthesis typically yields a single deterministic strategy, limiting adaptability. In contrast, permissive controllers (multi-strategies) allow multiple actions per state, enabling runtime flexibility and resilience. However, prior work on permissive controller synthesis generally assumes exact transition probabilities, which is unrealistic in many robotic applications. We present the first framework for robust permissive controller synthesis on IMDPs, guaranteeing that all strategies compliant with the synthesized multi-strategy satisfy reachability or reward-based specifications under all admissible transitions. We formulate the problem as mixed-integer linear programs (MILPs) and propose two encodings: a baseline vertex-enumeration method and a scalable duality-based method that avoids explicit enumeration. Experiments on four benchmark domains show that both methods synthesize robust, maximally permissive controllers and scale to large IMDPs with up to hundreds of thousands of states.
