CaRE: Finding Root Causes of Configuration Issues in Highly-Configurable Robots

Md Abir Hossen, Sonam Kharade, Bradley Schmerl, Javier Cámara, Jason M. O'Kane, Ellen C. Czaplinski, Katherine A. Dzurilla, David Garlan, Pooyan Jamshidi

TL;DR

CaRE tackles root-cause debugging in highly configurable robots by learning a causal model that links configuration options to performance objectives. It uses a three-layer model and constraint-aware causal discovery (FCI) to build a Partial Ancestral Graph, then extracts and ranks causal paths via average causal effects (ACE) to identify high-impact root causes. The approach is validated on Husky and Turtlebot 3 across simulation and real-world deployments, showing strong accuracy and transferability, and outperforming a baseline correlational method (CBI). CaRE demonstrates practical utility by enabling targeted debugging, transferring learned models across platforms, and offering a structured workflow for causal debugging in robotics.
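The ranking step can be illustrated with a minimal sketch. It is not the authors' implementation: it skips causal discovery entirely, assumes a known toy structural model (`simulate`), and ranks individual options rather than causal paths. The option names (`max_vel`, `inflation_radius`, `log_level`), the coefficients, and the choice of averaging absolute differences over all value pairs are assumptions made only for illustration.

```python
import random
from itertools import combinations
from statistics import mean


def simulate(config, rng):
    """Hypothetical structural model: task-completion time for one run.

    The coefficients are invented for illustration; log_level has no effect.
    """
    noise = rng.gauss(0.0, 0.1)
    return 10.0 - 4.0 * config["max_vel"] + 0.5 * config["inflation_radius"] + noise


def expected_outcome(option, value, baseline, rng, n=200):
    """Estimate E[Y | do(option = value)] by forcing the option and sampling."""
    config = dict(baseline)
    config[option] = value  # the intervention: overwrite the option's value
    return mean(simulate(config, rng) for _ in range(n))


def average_causal_effect(option, values, baseline, rng):
    """Mean absolute change in expected outcome over all pairs of values."""
    expectations = {v: expected_outcome(option, v, baseline, rng) for v in values}
    diffs = [abs(expectations[b] - expectations[a]) for a, b in combinations(values, 2)]
    return mean(diffs)


def rank_options(domains, baseline, seed=0):
    """Return (option, ACE) pairs sorted from largest to smallest effect."""
    rng = random.Random(seed)
    ace = {
        opt: average_causal_effect(opt, vals, baseline, rng)
        for opt, vals in domains.items()
    }
    return sorted(ace.items(), key=lambda item: item[1], reverse=True)


# Hypothetical option domains and baseline configuration.
DOMAINS = {
    "max_vel": [0.5, 1.0, 1.5],
    "inflation_radius": [0.2, 0.6, 1.0],
    "log_level": [0, 1, 2],
}
BASELINE = {"max_vel": 1.0, "inflation_radius": 0.6, "log_level": 1}

ranking = rank_options(DOMAINS, BASELINE)
for name, effect in ranking:
    print(f"{name}: ACE = {effect:.3f}")
```

Options at the top of the ranking are the first candidates to inspect when a fault is observed; an option with an effect near zero, such as `log_level` in this toy model, can be deprioritized.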

Abstract

Robotic systems have subsystems with a combinatorially large configuration space and hundreds or thousands of possible software and hardware configuration options interacting non-trivially. The configurable parameters are set to target specific objectives, but they can cause functional faults when incorrectly configured. Finding the root cause of such faults is challenging due to the exponentially large configuration space and the dependencies between the robot's configuration settings and performance. This paper proposes CaRE -- a method for diagnosing the root cause of functional faults through the lens of causality. CaRE abstracts the causal relationships between various configuration options and the robot's performance objectives by learning a causal structure and estimating the causal effects of options on robot performance indicators. We demonstrate CaRE's efficacy by finding the root cause of the observed functional faults and validating the diagnosed root cause by conducting experiments in both physical robots (Husky and Turtlebot 3) and in simulation (Gazebo). Furthermore, we demonstrate that the causal models learned from robots in simulation (e.g., Husky in Gazebo) are transferable to physical robots across different platforms (e.g., Husky and Turtlebot 3).

Paper Structure (49 sections, 4 equations, 10 figures, 8 tables, 2 algorithms)

Figures (10)

  • Figure 1: An example showing the effectiveness of causality in reasoning about the robot's behavior. (a) Observational data (incorrectly) shows that an increase in the planner failure rate for producing a path leads to a higher probability of mission success; (b) incorporating obstacle cost along the trajectory as a confounder correctly shows an increase in planner failure corresponding to a decrease in probability of mission success (negative correlation); (c) the causal model correctly captures obstacle cost as a common cause to explain the robot's behavior.
  • Figure 2: Different functional faults. (a) A delay in the data transform results in a functional fault where the robot stops $0.5\,\mathrm{m}$ away from the target location and transmits incorrect artifact locations; (b) a change in the environment results in an indecisive robot that is stuck in place, where the circles surrounding the robot represent the inflation radius.
  • Figure 3: OWLAT hardware platform performing an excavation task: (a) the excavation task failed because the robotic arm was unable to find the surface; (b) the desired excavation operation without the fault.
  • Figure 4: Overview of CaRE
  • Figure 5: Experimental environments, (a) simulated in Gazebo, (b) a real environment located at the University of South Carolina.
  • ...and 5 more figures
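
The sign flip described in the Figure 1 caption can be reproduced on synthetic data. The sketch below is a toy under stated assumptions, not the paper's data: the obstacle-cost levels, all coefficients, and the direction in which obstacle cost shifts the two variables are invented purely so that the pooled correlation is positive while the correlation within each obstacle-cost level is negative.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def generate(n_per_level=300, seed=1):
    """Synthetic runs: (obstacle_cost_level, planner_failure_rate, success_score)."""
    rng = random.Random(seed)
    rows = []
    for level in (0, 1, 2):  # hypothetical obstacle-cost levels (the confounder)
        for _ in range(n_per_level):
            u = rng.uniform(0.0, 0.2)  # run-to-run variation in planner failure
            failure = 0.1 + 0.25 * level + u
            # Within a level, more planner failure lowers the success score.
            success = 0.3 + 0.25 * level - 0.8 * u + rng.gauss(0.0, 0.02)
            rows.append((level, failure, success))
    return rows


rows = generate()

# Pooled view: ignores obstacle cost, so the confounder drives the trend.
pooled = pearson([r[1] for r in rows], [r[2] for r in rows])

# Stratified view: condition on the confounder by grouping on its level.
within = {}
for level in (0, 1, 2):
    group = [r for r in rows if r[0] == level]
    within[level] = pearson([r[1] for r in group], [r[2] for r in group])

print(f"pooled correlation: {pooled:+.2f}")
for level, corr in within.items():
    print(f"obstacle-cost level {level}: {corr:+.2f}")
```

Conditioning on the common cause recovers the negative relationship that the pooled data hides, which is the motivation for reasoning over a causal model rather than raw correlations.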