DRIFT: Diffusion-based Rule-Inferred For Trajectories

Jinyang Zhao, Handong Zheng, Yanjiu Zhong, Qiang Zhang, Yu Kang, Shunyu Wu

TL;DR

This article proposes DRIFT (Diffusion-based Rule-Inferred For Trajectories), a conditional diffusion framework that generates high-fidelity reference trajectories by integrating two complementary inductive biases: a relational bias from a GNN-based scene encoder and a temporal attention bias from a graph-conditioned GRU. It achieves centimeter-level imitation fidelity and competitive smoothness.

Abstract

Trajectory generation for mobile robots in unstructured environments faces a critical dilemma: balancing kinematic smoothness for safe execution against terminal precision for fine-grained tasks. Existing generative planners often struggle with this trade-off, yielding either smooth but imprecise paths or geometrically accurate but erratic motions. To address this, this article proposes DRIFT (Diffusion-based Rule-Inferred For Trajectories), a conditional diffusion framework designed to generate high-fidelity reference trajectories by integrating two complementary inductive biases. First, a Relational Inductive Bias, realized via a GNN-based Structured Scene Perception (SSP) module, encodes global topological constraints to ensure holistic smoothness. Second, a Temporal Attention Bias, implemented through a novel Graph-Conditioned Time-Aware GRU (GTGRU), dynamically attends to sparse obstacles and targets for precise local maneuvering. Quantitative results show that DRIFT reconciles these conflicting objectives, achieving centimeter-level imitation fidelity (0.041 m FDE) and competitive smoothness (27.19 Jerk). This balance yields highly executable reference plans for downstream control.
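
The two headline metrics can be illustrated with a minimal sketch. FDE (final displacement error) is the Euclidean distance between the last predicted waypoint and the last ground-truth waypoint. This page does not state the exact jerk formula or its units, so the jerk measure below (mean magnitude of the third finite difference of position) is an assumption for illustration only, and the function names are hypothetical.

```python
import math


def final_displacement_error(pred, gt):
    """Euclidean distance between the final predicted and ground-truth waypoints."""
    (px, py), (gx, gy) = pred[-1], gt[-1]
    return math.hypot(px - gx, py - gy)


def mean_jerk_magnitude(traj, dt):
    """Mean magnitude of the third finite difference of position over dt**3.

    Assumption: the paper's exact jerk definition is not given on this page;
    this is one common discrete approximation, not necessarily the authors'.
    """
    if len(traj) < 4:
        raise ValueError("need at least 4 waypoints to estimate jerk")
    total = 0.0
    for i in range(len(traj) - 3):
        # Third forward difference: p[i+3] - 3 p[i+2] + 3 p[i+1] - p[i]
        jx = traj[i + 3][0] - 3 * traj[i + 2][0] + 3 * traj[i + 1][0] - traj[i][0]
        jy = traj[i + 3][1] - 3 * traj[i + 2][1] + 3 * traj[i + 1][1] - traj[i][1]
        total += math.hypot(jx, jy) / dt ** 3
    return total / (len(traj) - 3)


if __name__ == "__main__":
    gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.03, 0.04)]
    print("FDE:", final_displacement_error(pred, gt))
    print("Jerk of constant-velocity path:", mean_jerk_magnitude(gt, dt=1.0))
```

Under this definition a constant-velocity path has zero jerk, so lower values indicate smoother motion, while a lower FDE indicates a more precise terminal pose.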

Paper Structure (29 sections, 8 equations, 5 figures, 1 table, 1 algorithm)

Figures (5)

  • Figure 1: DRIFT fuses relational topological bias (SSP) and temporal attention bias (GTGRU) to produce high-fidelity reference plans. The Denoising Network (DN) then iteratively refines trajectories from these structured encodings, balancing smoothness and precision.
  • Figure 2: Voxel downsampling converts the dense LiDAR point cloud (green) into sparse graph nodes (red) (Algorithm 1), cutting computational cost and producing structured topologies for GNN relational reasoning.
  • Figure 3: Terminal accuracy comparison for close-proximity docking. DRIFT (purple) maintains a 92.55% success rate at 0.1 m, whereas the DTG baseline (blue) drops to 45.49%. This result demonstrates DRIFT's ability to resolve the precision–smoothness trade-off through high-precision local maneuvering.
  • Figure 4: Predicted trajectories (dark) vs. ground truth (light gray) across models: DTG is smooth but deviates and misses terminals; CAVE and BC are erratic or imprecise. DRIFT resolves the precision–smoothness trade-off, matching the ground truth while remaining smooth.
  • Figure 5: Predicted trajectory (dark) closely matches ground truth (light gray) in (a) trajectory, (b) 2D map, and (c) point-cloud views, showing DRIFT reconciles precision and smoothness.
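
The voxel downsampling step described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the paper's SSP algorithm: using the centroid of each occupied voxel as the node, and connecting nodes within a fixed radius, are both assumptions, as are the names `voxel_downsample` and `radius_edges`.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Collapse all points falling in the same voxel into one node (their centroid)."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    nodes = []
    for key in sorted(buckets):  # sorted for a deterministic node order
        pts = buckets[key]
        nodes.append(tuple(sum(axis) / len(pts) for axis in zip(*pts)))
    return nodes


def radius_edges(nodes, radius):
    """Connect every pair of nodes within `radius` (assumed graph construction)."""
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= radius:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.0), (0.3, 0.1, 0.0), (1.2, 0.1, 0.0), (5.0, 5.0, 0.0)]
    nodes = voxel_downsample(cloud, voxel_size=1.0)
    print(len(cloud), "points ->", len(nodes), "nodes")
    print("edges:", radius_edges(nodes, radius=1.5))
```

The node count is bounded by the number of occupied voxels rather than the number of raw LiDAR returns, which is what reduces the cost of the downstream GNN. The all-pairs edge search is quadratic and is only suitable for the small node sets that remain after downsampling.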