DTCCL: Disengagement-Triggered Contrastive Continual Learning for Autonomous Bus Planners

Yanding Yang, Weitao Zhou, Jinhai Wang, Xiaomin Guo, Junze Wen, Xiaolong Liu, Lang Ding, Zheng Fu, Jinyu Miao, Kun Jiang, Diange Yang

TL;DR

DTCCL tackles planner-level failures in autonomous buses by exploiting real-world disengagement events in a closed-loop cloud-edge framework. It introduces disengagement-triggered data augmentation and a triplet contrastive objective to refine policy representations while preserving prior behaviors. Empirical results on real routes and nuPlan-style benchmarks show a substantial performance boost and reduced collision rates compared with direct imitation learning and naive augmentation. The approach enables scalable, automated policy improvement for public transport in dynamic urban environments.

Abstract

Autonomous buses run on fixed routes but must operate in open, dynamic urban environments. Disengagement events on these routes are often geographically concentrated and typically arise from planner failures in highly interactive regions. Such policy-level failures are difficult to correct using conventional imitation learning, which easily overfits to sparse disengagement data. To address this issue, this paper presents a Disengagement-Triggered Contrastive Continual Learning (DTCCL) framework that enables autonomous buses to improve planning policies through real-world operation. Each disengagement triggers cloud-based data augmentation that generates positive and negative samples by perturbing surrounding agents while preserving route context. Contrastive learning refines policy representations to better distinguish safe and unsafe behaviors, and continual updates are applied in a cloud-edge loop without human supervision. Experiments on urban bus routes demonstrate that DTCCL improves overall planning performance by 48.6 percent compared with direct retraining, validating its effectiveness for scalable, closed-loop policy improvement in autonomous public transport.
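The contrastive step described above can be illustrated with the standard triplet margin loss. This is a minimal sketch, not the paper's exact objective: the Euclidean distance, the `margin` value, and the toy two-dimensional embeddings are all assumptions made for illustration. The anchor stands for the embedding of a recorded disengagement scene, the positive for a safe augmented variant, and the negative for an unsafe one.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The margin of 1.0 is an illustrative assumption, not a value from the paper.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical): the safe variant is near the anchor,
# the unsafe variant is far from it.
anchor = [0.0, 0.0]
safe_variant = [0.0, 1.0]    # distance 1 from the anchor
unsafe_variant = [3.0, 4.0]  # distance 5 from the anchor

# Well-separated case: 1 - 5 + 1 < 0, so the loss is clamped to zero.
loss_separated = triplet_loss(anchor, safe_variant, unsafe_variant)

# Swapped roles: 5 - 1 + 1 = 5, so the loss penalizes the bad ordering.
loss_violated = triplet_loss(anchor, unsafe_variant, safe_variant)

print(loss_separated, loss_violated)
```

The loss is zero once the unsafe variant sits at least `margin` farther from the anchor than the safe variant, so gradient updates only act on triplets that the representation still confuses.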

Paper Structure

This paper contains 32 sections, 7 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Closed-loop continual learning framework for autonomous bus planning. The driving policy is first pretrained via imitation learning on large-scale human driving data. During deployment, disengagement events are automatically recorded and used to trigger policy learning, enabling the planner to achieve improved performance when encountering similar scenarios in the future.
  • Figure 2: Framework of the proposed DTCCL method. A base policy is first pretrained via imitation learning on large-scale human driving data. During deployment, disengagement events are recorded and transformed into positive and negative variants through safety-aware augmentation. Contrastive learning is then applied to improve representation robustness, enabling continual policy refinement across deployment cycles.
  • Figure 3: Autonomous bus platform equipped with multi-sensor perception, including LiDARs, short- and medium-range cameras, millimeter-wave radars, and composite navigation modules.
  • Figure 4: Red-light violation case before and after DTCCL learning. Left: As the autonomous bus approaches an intersection with a red light detected, the initial policy incorrectly assigns a forward-driving trajectory, leading to disengagement. Right: After DTCCL training, the policy generates a stop trajectory before the intersection, preventing the violation.
  • Figure 5: Left: While overtaking a slow-moving vehicle, the initial policy yields an overly conservative leftward deviation, bringing the autonomous bus dangerously close to the roadside guardrail and triggering a disengagement. Right: After DTCCL learning, the updated policy generates a stable, collision-free trajectory that safely avoids the lead vehicle without encroaching on the guardrail.
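
The augmentation stage summarized in the Figure 2 caption (perturbing surrounding agents while keeping the route context fixed) might be sketched as follows. This is a hypothetical illustration only: the uniform position noise, the clearance-based labeling rule, and every name and threshold (`augment`, `noise_m`, `safe_m`) are assumptions, since the source does not specify how variants are generated or labeled.

```python
import math
import random


def min_clearance(ego_traj, agents):
    """Smallest distance between any ego waypoint and any agent position."""
    return min(
        math.hypot(ex - ax, ey - ay)
        for ex, ey in ego_traj
        for ax, ay in agents
    )


def augment(ego_traj, agents, n_variants=20, noise_m=2.0, safe_m=1.5, seed=0):
    """Create perturbed copies of one scene and split them by clearance.

    Assumed labeling rule (not from the paper): a variant is positive when
    the unchanged ego trajectory keeps at least `safe_m` meters from every
    perturbed agent, and negative otherwise.
    """
    rng = random.Random(seed)
    positives, negatives = [], []
    for _ in range(n_variants):
        # Only the surrounding agents move; the ego route is preserved.
        perturbed = [
            (ax + rng.uniform(-noise_m, noise_m),
             ay + rng.uniform(-noise_m, noise_m))
            for ax, ay in agents
        ]
        variant = {"ego_traj": ego_traj, "agents": perturbed}
        if min_clearance(ego_traj, perturbed) >= safe_m:
            positives.append(variant)
        else:
            negatives.append(variant)
    return positives, negatives


# Toy scene (hypothetical): a straight ego path and one nearby agent.
ego_traj = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0)]
agents = [(3.0, 5.0)]
positives, negatives = augment(ego_traj, agents)
print(len(positives), len(negatives))
```

Each positive and negative variant could then be paired with the original disengagement scene to form the triplets consumed by a contrastive objective.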