ReinDriveGen: Reinforcement Post-Training for Out-of-Distribution Driving Scene Generation

Hao Zhang, Lue Fan, Weikang Bian, Zehuan Wu, Lewei Lu, Zhaoxiang Zhang, Hongsheng Li

Abstract

We present ReinDriveGen, a framework that enables full controllability over dynamic driving scenes, allowing users to freely edit actor trajectories to simulate safety-critical corner cases such as front-vehicle collisions, drifting cars, vehicles spinning out of control, pedestrians jaywalking, and cyclists cutting across lanes. Our approach constructs a dynamic 3D point cloud scene from multi-frame LiDAR data, introduces a vehicle completion module to reconstruct full 360° geometry from partial observations, and renders the edited scene into 2D condition images that guide a video diffusion model to synthesize realistic driving videos. Since such edited scenarios inevitably fall outside the training distribution, we further propose an RL-based post-training strategy with a pairwise preference model and a pairwise reward mechanism, enabling robust quality improvement under out-of-distribution conditions without ground-truth supervision. Extensive experiments demonstrate that ReinDriveGen outperforms existing approaches on edited driving scenarios and achieves state-of-the-art results on novel ego viewpoint synthesis.
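The abstract's pairwise reward mechanism ranks generated video candidates into positive and negative sets via pairwise comparisons. A minimal sketch of that ranking step, assuming a hypothetical `preference(a, b)` stand-in for the learned pairwise preference model (here it just compares scalar quality proxies; the paper's model would compare generated videos):

```python
def preference(a: float, b: float) -> bool:
    """Return True if candidate `a` is preferred over candidate `b`.
    Hypothetical stand-in: a real implementation would query the
    learned pairwise preference model on the two generated videos."""
    return a > b

def pairwise_rank(candidates: list[float]) -> tuple[list[int], list[int]]:
    """Count pairwise wins per candidate (a Borda-style tally), then
    split the ranking into positive (top half) and negative (bottom
    half) index sets for contrastive supervision."""
    n = len(candidates)
    wins = [0] * n
    for i in range(n):
        for j in range(n):
            if i != j and preference(candidates[i], candidates[j]):
                wins[i] += 1
    order = sorted(range(n), key=lambda i: wins[i], reverse=True)
    half = n // 2
    return order[:half], order[half:]

# Rank four candidates by their (proxy) quality scores.
pos, neg = pairwise_rank([0.2, 0.9, 0.5, 0.7])
# pos holds the indices of the preferred half, neg the rest.
```

The exact comparison model, candidate count, and split ratio are assumptions for illustration; the paper specifies only that pairwise preferences partition candidates into positive and negative sets.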

Paper Structure

This paper contains 32 sections, 5 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: ReinDriveGen enables photorealistic generation of OOD driving scenarios. ReinDriveGen can manipulate actor trajectories to synthesize diverse safety-critical corner cases such as vehicle in-place spinning, cyclist crossing, and left-turn collisions. Our RL-based post-training significantly improves generation quality for these OOD edits without requiring ground-truth supervision.
  • Figure 2: Overview of ReinDriveGen's two core components. Left: Our simulator edits dynamic point clouds and completes vehicle geometry to render structural pseudo-images, which condition the video diffusion model for photorealistic synthesis. Right: To enhance quality in out-of-distribution (OOD) scenarios, we employ RL post-training. A pairwise reward mechanism ranks generated candidates into positive and negative sets, providing robust contrastive supervision.
  • Figure 3: Qualitative demonstrations of ReinDriveGen on OOD safety-critical corner cases. Every two consecutive rows show two frames from the same scene, with the edit type and edited region annotated in each image.
  • Figure 4: Qualitative comparison of novel trajectories in the lane-change scenario.
  • Figure 5: Qualitative comparison of vehicle trajectory editing. Top two rows: the target vehicle is shifted 6 m to the left and 4 m backward. Bottom two rows: the target vehicle is shifted 6 m to the right and 1 m backward.
  • ...and 5 more figures