Leveraging Counterfactual Paths for Contrastive Explanations of POMDP Policies

Benjamin Kraske, Zakariya Laouar, Zachary Sunberg

TL;DR

This work addresses the transparency of POMDP policies by using user-provided counterfactual paths to generate contrastive explanations. It defines feature occupancy and feature expectations to compare an open-loop counterfactual policy against a near-optimal POMDP policy in a search and rescue (SAR) setting, so that the two can be distinguished in terms of problem objectives and constraints. With SARSOP supplying an approximately optimal policy, differences in feature expectations show why certain paths are chosen or avoided, especially under uncertainty and resource limits. Two SAR case studies show how observable versus partially observable objectives and battery constraints shape the resulting explanations, with the aim of supporting trust and effective human–robot collaboration in autonomous search tasks.

Abstract

As humans come to rely on autonomous systems more, ensuring the transparency of such systems is important to their continued adoption. Explainable Artificial Intelligence (XAI) aims to reduce confusion and foster trust in systems by providing explanations of agent behavior. Partially observable Markov decision processes (POMDPs) provide a flexible framework capable of reasoning over transition and state uncertainty, while also being amenable to explanation. This work investigates the use of user-provided counterfactuals to generate contrastive explanations of POMDP policies. Feature expectations are used as a means of contrasting the performance of these policies. We demonstrate our approach in a Search and Rescue (SAR) setting. We analyze and discuss the associated challenges through two case studies.
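To make the idea of contrasting policies through feature expectations concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical five-cell corridor, a hypothetical prior over the hidden target location, two illustrative features (`target_found` and `battery_used`), and a hand-written closed-loop stand-in where the paper would use a SARSOP policy. Feature expectations are taken here in the common discounted form, the expected sum of discounted feature values along a trajectory, which may differ from the paper's exact definition.

```python
from typing import Callable, Dict, Sequence

# All constants below are hypothetical values chosen for illustration.
GAMMA = 0.9                  # discount factor
N_CELLS = 5                  # corridor cells 0..4
START = 2                    # robot start cell
BATTERY = 6                  # maximum number of time steps
PRIOR: Dict[int, float] = {0: 0.7, 4: 0.3}   # belief over the hidden target cell

# A policy maps (step, position, found) to a move in {-1, 0, +1}.
Policy = Callable[[int, int, bool], int]


def open_loop(path: Sequence[int]) -> Policy:
    """Wrap a fixed action sequence (a user counterfactual path) as a policy."""
    def act(step: int, pos: int, found: bool) -> int:
        # Open loop: ignores position and observations entirely.
        return path[step] if step < len(path) else 0
    return act


def closed_loop_stand_in(step: int, pos: int, found: bool) -> int:
    """Hand-written stand-in for a solver policy (not SARSOP output).

    Checks the likelier cell 0 first, then sweeps right, and stops
    moving as soon as the target has been found.
    """
    if found:
        return 0
    return -1 if step < 2 else 1


def feature_expectations(policy: Policy) -> Dict[str, float]:
    """Exact discounted feature expectations under PRIOR.

    Transitions are deterministic here, so enumerating target cells
    replaces Monte Carlo rollouts.
    """
    mu = {"target_found": 0.0, "battery_used": 0.0}
    for target, prob in PRIOR.items():
        pos, found = START, False
        for t in range(BATTERY):
            move = policy(t, pos, found)
            pos = min(max(pos + move, 0), N_CELLS - 1)
            mu["battery_used"] += prob * GAMMA**t * abs(move)
            if not found and pos == target:
                found = True
                mu["target_found"] += prob * GAMMA**t
    return mu


# User counterfactual: "why not search the right end first?"
user_path = [1, 1, -1, -1, -1, -1]
mu_policy = feature_expectations(closed_loop_stand_in)
mu_user = feature_expectations(open_loop(user_path))

# The contrastive explanation is read off the per-feature differences.
delta = {name: mu_policy[name] - mu_user[name] for name in mu_policy}
for name, value in delta.items():
    print(f"{name}: policy minus counterfactual = {value:+.3f}")
```

In this toy setting the stand-in policy scores higher on `target_found` and lower on `battery_used` than the user's path, which is the kind of per-feature contrast an explanation could report. The open-loop path keeps spending battery after the target is found because it cannot react to observations.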

Paper Structure (13 sections, 3 equations, 1 figure, 2 tables)


Figures (1)

  • Figure 1: Case studies. (a), (b) An example in which the readily observable objective and the more valuable, partially observable, objective do not align. Note the target location (orange star) is unknown initially and only discovered by the robot (blue circle) after the optimal policy is executed. (c), (d) An example in which constraints restrict the feasibility of a proposed user policy. The black arrows represent the executed actions while the gray arrows represent the remaining actions of the user counterfactual path that were not executed due to the agent reaching a terminal battery state.