CAPE: Context-Aware Diffusion Policy Via Proximal Mode Expansion for Collision Avoidance

Rui Heng Yang; Xuan Zhao; Leo Maxime Brunswic; Montgomery Alban; Mateo Clemente; Tongtong Cao; Jun Jin; Amir Rasouli

TL;DR

CAPE (Context-Aware diffusion policy via Proximal mode Expansion) addresses the limited trajectory multimodality of diffusion-based robot policies. It fuses a context-aware prior, derived from previous trajectory segments, with training-free, context-guided denoising, and iteratively expands trajectory modes to generate collision-free, goal-consistent plans in unseen environments. The method preserves task intent while enlarging the support of the trajectory distribution, and it reports up to 26% and 80% higher success rates than state-of-the-art methods on cluttered simulated and real-world tasks, respectively. These results suggest that CAPE generalizes robustly for collision avoidance without requiring datasets that exhaustively cover obstacle configurations or heavy online optimization.

Abstract

In robotics, diffusion models can capture multi-modal trajectories from demonstrations, making them a transformative approach in imitation learning. However, achieving optimal performance under this regime requires a large-scale dataset, which is costly to obtain, especially for challenging tasks such as collision avoidance. In those tasks, generalization at test time demands coverage of many obstacle types and their spatial configurations, which is impractical to acquire purely via data. To remedy this problem, we propose Context-Aware diffusion policy via Proximal mode Expansion (CAPE), a framework that expands trajectory distribution modes with a context-aware prior and guidance at inference via a novel prior-seeded iterative guided refinement procedure. The framework generates an initial trajectory plan and executes a short prefix of it; the remaining trajectory segment is then perturbed to an intermediate noise level, forming a trajectory prior. Such a prior is context-aware and preserves task intent. Repeating the process with context-aware guided denoising iteratively expands mode support, allowing the policy to find smoother, less collision-prone trajectories. For collision avoidance, CAPE expands trajectory distribution modes with collision-aware context, enabling the sampling of collision-free trajectories in previously unseen environments while maintaining goal consistency. We evaluate CAPE on diverse manipulation tasks in cluttered, unseen simulated and real-world settings and show up to 26% and 80% higher success rates, respectively, compared to state-of-the-art (SOTA) methods, demonstrating better generalization to unseen environments.
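The loop described in the abstract (plan, execute a short prefix, re-noise the remainder into a prior, then denoise with guidance) can be illustrated with a toy sketch. This is not the authors' implementation: the learned diffusion policy is replaced by a simple smoothing step, the guidance is assumed to be the gradient of a hinge-squared penalty around a single circular obstacle, and every name and parameter here (`forward_noise`, `guided_denoise`, `rollout`, `alpha_bar`, `guide_scale`, and so on) is hypothetical.

```python
import numpy as np

def forward_noise(x0, alpha_bar, rng):
    """Perturb a trajectory to an intermediate noise level (DDPM-style forward step)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def collision_cost_grad(traj, obstacle, radius):
    """Gradient of sum(max(radius - dist, 0)^2) w.r.t. each waypoint (assumed cost)."""
    diff = traj - np.asarray(obstacle)
    dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-8
    penetration = np.maximum(radius - dist, 0.0)
    return -2.0 * penetration * diff / dist

def guided_denoise(x, anchor, goal, obstacle, radius, steps, guide_scale):
    """Placeholder denoiser: Laplacian smoothing stands in for the learned policy,
    followed by a gradient step on the collision cost as the guidance term."""
    x = x.copy()
    for _ in range(steps):
        padded = np.vstack([anchor, x, goal])
        smooth = 0.25 * padded[:-2] + 0.5 * padded[1:-1] + 0.25 * padded[2:]
        x = smooth - guide_scale * collision_cost_grad(smooth, obstacle, radius)
        x[-1] = goal  # keep the plan goal-consistent
    return x

def rollout(start, goal, obstacle, radius, horizon=16, prefix=4,
            alpha_bar=0.9, steps=20, guide_scale=0.5, seed=0):
    rng = np.random.default_rng(seed)
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    plan = np.linspace(start, goal, horizon)  # stand-in for an initial policy sample
    executed = []
    while len(plan) > prefix:
        executed.extend(plan[:prefix])                    # execute a short prefix
        prior = forward_noise(plan[prefix:], alpha_bar, rng)  # re-noise the remainder
        plan = guided_denoise(prior, executed[-1], goal,
                              obstacle, radius, steps, guide_scale)
    executed.extend(plan)
    return np.array(executed)

# Usage: plan from (0, 0) to (4, 0) around an obstacle placed slightly off the straight line.
traj = rollout(start=[0.0, 0.0], goal=[4.0, 0.0], obstacle=[2.0, 0.1], radius=0.5)
print(traj.shape)
```

Seeding each round with the re-noised remainder, rather than with pure noise, is what keeps successive plans close to the previous one in this sketch; only the intermediate noise level and the guidance term let the plan drift toward alternative, obstacle-avoiding shapes.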
