Generative Skill Chaining: Long-Horizon Skill Planning with Diffusion Models

Utkarsh A. Mishra; Shangjie Xue; Yongxin Chen; Danfei Xu

TL;DR

Generative Skill Chaining (GSC) introduces a diffusion-based, generative framework for long-horizon manipulation planning that operates on skill-level distributions. By training separate diffusion models for each primitive and composing them via forward-backward conditioning with a dependency factor, GSC enables parallel, constraint-aware sampling of feasible skill sequences for unseen task skeletons. The approach demonstrates robust constraint satisfaction, generalization to longer horizons and perturbations, and successful real-world deployment, highlighting scalability over prior greedy or search-based methods. This work offers a practical path toward flexible, long-horizon planning in robotics by leveraging probabilistic compositionality and test-time adaptability.

Abstract

Long-horizon tasks, usually characterized by complex subtask dependencies, present a significant challenge in manipulation planning. Skill chaining is a practical approach to solving unseen tasks by combining learned skill priors. However, such methods are myopic if skills are sequenced greedily, and they face scalability issues when paired with search-based planning strategies. To address these challenges, we introduce Generative Skill Chaining (GSC), a probabilistic framework that learns skill-centric diffusion models and composes their learned distributions to generate long-horizon plans during inference. GSC samples from all skill models in parallel to efficiently solve unseen tasks while enforcing geometric constraints. We evaluate the method on various long-horizon tasks and demonstrate its capability in reasoning about action dependencies, constraint handling, and generalization, along with its ability to replan in the face of perturbations. We show results in simulation and on a real robot to validate the efficiency and scalability of GSC, highlighting its potential for advancing long-horizon task planning. More details are available at: https://generative-skill-chaining.github.io/
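
The idea of composing learned skill distributions can be illustrated with a minimal toy sketch. It is not the GSC implementation: it assumes two hypothetical skills whose distributions over a shared intermediate state are 1-D Gaussians with known analytic scores (in place of learned diffusion models), and it combines them by simply summing their scores, which corresponds to sampling from the product of the two densities. The function names, the Gaussian parameters, and the use of unadjusted Langevin dynamics are all assumptions made for illustration; the dependency factor and forward-backward conditioning described above are not reproduced here.

```python
import numpy as np

def gaussian_score(x, mean, std):
    """Score (gradient of the log-density) of N(mean, std^2) evaluated at x."""
    return -(x - mean) / std**2

def composed_score(s, skills):
    """Sum the per-skill scores, i.e. the score of the unnormalized product
    of the skill densities over the shared state s."""
    return sum(gaussian_score(s, mean, std) for mean, std in skills)

def langevin_sample(score_fn, n_samples=5000, n_steps=1000, step=0.01, seed=0):
    """Unadjusted Langevin dynamics: all samples are updated in parallel."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)  # start from standard normal noise
    for _ in range(n_steps):
        noise = rng.normal(size=n_samples)
        x = x + step * score_fn(x) + np.sqrt(2.0 * step) * noise
    return x

# Hypothetical toy skills (assumed values, not from the paper):
# skill A tends to leave the shared state near 0,
# skill B is most likely to succeed when it starts near 2.
skills = [(0.0, 1.0), (2.0, 1.0)]
samples = langevin_sample(lambda s: composed_score(s, skills))

# The product of N(0, 1) and N(2, 1) is proportional to N(1, 0.5),
# so the samples should concentrate around 1 with variance close to 0.5.
print(samples.mean(), samples.var())
```

In this toy setting the composed samples settle between the two skills' preferred states, which loosely mirrors how a chained plan must pick intermediate states that are both reachable by one skill and usable by the next.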

Paper Structure (12 sections, 12 equations, 16 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Methodology
Results
Data Collection and Real-World Experiment Details
Limitations
Conclusion
Summary of the Algorithm
Additional Discussion
Task Descriptions
Additional Results

Figures (16)

Figure 1: (Top) Generative Skill Chaining (GSC) aims to solve a long-horizon task for a given sequence of skills by using linear probabilistic chains to sample in parallel from a joint distribution of multiple skill-specific transitions $q_\pi(\textbf{s}, \textbf{a}, \textbf{s}')$ learned using diffusion models. The framework implicitly considers transition feasibility and subsequent skill affordability while demonstrating flexible constraint-handling abilities. (Bottom) An example of a long-horizon TAMP problem composed of multiple skills. Such a task necessitates reasoning about inter-dependencies between actions.
Figure 2: (a) A linear chain graph for a long sequence of transitions and (b) the same chain with an additional constraint node.
Figure 3: (Left) The primitive skills and their executions are shown with the objects of interest. (Right) Transformer-based skill diffusion model. We use the noisy state-action-state distribution $\textbf{x}_t \sim \{\textbf{s}_t, \textbf{a}_t, \textbf{s}'_t\}$ at diffusion step $t$ to obtain the corresponding $\epsilon_\theta$ during sampling. The skill object order depends on the objects of interest and is represented as a collection of one-hot vectors.
Figure 4: Toy Domain: We model four distributions of states and segment one of them into left and right segments. The figure illustrates diffusion model composition using GSC with a fixed start state and unit actions, followed by the addition of soft and hard constraint guidance.
Figure 5: For a constrained packing task of picking and placing all the boxes on the rack, task-agnostic secondary placement objectives help in realizing accurate and consistent state-action sequences in (Top) simulation and (Bottom) open-loop hardware rollouts.
...and 11 more figures
