SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks

Yongyan Wen, Siyuan Li, Rongchang Zuo, Lei Yuan, Hangyu Mao, Peng Liu

TL;DR

This paper proposes SkillTree, a novel hierarchical framework that reduces the complex continuous action space of challenging control tasks to a discrete skill space and significantly enhances the transparency and explainability of the decision-making process.

Abstract

Deep reinforcement learning (DRL) has achieved remarkable success in various research domains. However, its reliance on neural networks results in a lack of transparency, which limits its practical applications. To achieve explainability, decision trees have emerged as a popular and promising alternative to neural networks. Nonetheless, due to their limited expressiveness, traditional decision trees struggle with high-dimensional, long-horizon continuous control tasks. In this paper, we propose SkillTree, a novel framework that reduces complex continuous action spaces to discrete skill spaces. Our hierarchical approach integrates a differentiable decision tree within the high-level policy to generate skill embeddings, which subsequently guide the low-level policy in executing skills. By making skill decisions explainable, we achieve skill-level explainability, enhancing the understanding of the decision-making process in complex tasks. Experimental results demonstrate that our method achieves performance comparable to skill-based neural networks in complex robotic arm control domains. Furthermore, SkillTree offers explanations at the skill level, thereby increasing the transparency of the decision-making process.
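To make the idea of a differentiable decision tree selecting a discrete skill more concrete, the following is a minimal sketch of a generic soft decision tree, not the authors' implementation. Every name in it (`leaf_probabilities`, `select_skill`, `leaf_skill`, `codebook`) and all of the numbers are hypothetical and chosen only for illustration. It assumes a complete binary tree whose inner nodes apply a sigmoid gate to a linear function of the state, and whose leaves each point to one entry of a skill codebook.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def leaf_probabilities(state, weights, biases, depth):
    """Probability of reaching each leaf of a complete binary soft tree.

    Inner nodes are stored in breadth-first order, so node i has
    children 2i+1 (left) and 2i+2 (right). Each inner node routes
    right with probability sigmoid(w . state + b).
    """
    probs = [1.0]  # reach probability of every node on the current level
    first = 0      # breadth-first index of the first node on this level
    for _ in range(depth):
        next_probs = []
        for offset, p in enumerate(probs):
            node = first + offset
            logit = sum(w * s for w, s in zip(weights[node], state)) + biases[node]
            go_right = sigmoid(logit)
            next_probs.append(p * (1.0 - go_right))  # left child
            next_probs.append(p * go_right)          # right child
        first += len(probs)
        probs = next_probs
    return probs


def select_skill(state, weights, biases, leaf_skill, codebook, depth):
    """Pick the most probable leaf and return its skill index and embedding."""
    probs = leaf_probabilities(state, weights, biases, depth)
    leaf = max(range(len(probs)), key=probs.__getitem__)
    skill_index = leaf_skill[leaf]
    return skill_index, codebook[skill_index], probs


# Hypothetical depth-2 tree over a 2-dimensional state (illustrative values).
weights = [[10.0, 0.0], [0.0, 10.0], [0.0, 10.0]]  # one weight vector per inner node
biases = [0.0, 0.0, 0.0]
leaf_skill = [0, 1, 2, 1]                          # skill index stored at each leaf
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]    # discrete skill embeddings

skill_index, embedding, probs = select_skill(
    [1.0, -1.0], weights, biases, leaf_skill, codebook, depth=2
)
print(skill_index, embedding)
```

In this sketch the chosen embedding would be handed to a low-level policy, and the path to the chosen leaf can be read off as a sequence of threshold-like tests on state features, which is the kind of skill-level explanation the abstract describes.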

Paper Structure

This paper contains 21 sections, 6 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: Comparison of the soft decision tree (left) and the hard decision tree (right).
  • Figure 2: Discrete skill embedding learning and downstream high-level DT policy learning. After skill learning is complete, we freeze the decoder and skill prior, then fine-tune the codebook during high-level policy learning.
  • Figure 3: Four long-horizon sparse-reward tasks used for evaluation. (a) The robotic arm has to finish four subtasks in the correct order, i.e., Microwave - Kettle - Bottom Burner - Light (MKBL). (b) Similar to (a), but with different subtasks: Microwave - Light - Slide Cabinet - Hinge Cabinet (MLSH). (c) Finish subtasks in the correct order, i.e., Open Drawer - Turn on Lightbulb - Move Slider Left - Turn on LED. (d) In the office cleaning task, the robotic arm needs to pick up objects and place them in corresponding containers in sequence.
  • Figure 4: Downstream task learning curves of both our method and baselines. Averaged over 5 independent runs.
  • Figure 5: Visualization of the SkillTree (DC+D) with depth 3. qpos and qpos_obj denote the positions of the robotic arm and the objects, respectively. $n$ denotes the number of state-skill pairs in the divided decision set.
  • ...and 2 more figures