3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications

Eduardo R. Corral-Soto, Yang Liu, Tongtong Cao, Yuan Ren, Liu Bingbing

TL;DR

This work tackles the scarcity of diverse cyclist data for autonomous driving perception by introducing a pipeline that generates synthetic, dynamic, articulated 3D cyclists. The approach combines a new articulated bicycle dataset (3DArticBikes), a 3D Gaussian Splatting-based parametric bicycle model, and a rider-on-bike assembly step that uses an inverse kinematics pose refinement grounded in SMPL/RenderPeople data. Key contributions include the 3DArticBikes dataset, a controllable 8-DoF bicycle model, and a rider refinement procedure that aligns rider joints to bike keypoints, enabling realistic animated cyclists for training and evaluation. The authors report perceptual quality gains over a stable diffusion-based baseline and gains in bicycle semantic segmentation, pointing to the method’s potential for spatio-temporal analysis and pose estimation in complex human-object interactions within autonomous driving contexts.

Abstract

In Autonomous Driving (AD) perception, cyclists are considered safety-critical scene objects. Commonly used publicly available AD datasets typically contain large numbers of car and other vehicle instances but few cyclist instances, usually with limited appearance and pose diversity. This scarcity of cyclist training data not only limits the generalization of deep-learning perception models for cyclist semantic segmentation, pose estimation, and cyclist crossing intention prediction, but also limits research on new cyclist-related tasks such as fine-grained cyclist pose estimation and spatio-temporal analysis under complex interactions between humans and articulated objects. To address this problem, we propose a framework to generate synthetic dynamic 3D cyclist data assets that can be used to generate training data for different tasks. Within this framework, we design a methodology for creating a new part-based, multi-view, articulated synthetic 3D bicycle dataset, which we call 3DArticBikes and use to train a 3D Gaussian Splatting (3DGS)-based reconstruction and image rendering method. We then propose a parametric bicycle 3DGS composition model to assemble 8-DoF pose-controllable 3D bicycles. Finally, using dynamic information from cyclist videos, we build a complete synthetic dynamic 3D cyclist (a rider pedaling a bicycle) by re-posing a selectable synthetic 3D person and automatically placing the rider onto one of our new articulated 3D bicycles using a proposed 3D keypoint optimization-based Inverse Kinematics pose refinement. We present both qualitative and quantitative results in which we compare our generated cyclists against those from a recent stable diffusion-based method.

Paper Structure

This paper contains 14 sections, 5 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Proposed Pipeline to Generate Dynamic Synthetic 3D Cyclists.
  • Figure 2: (a) Generating the raw 3D bicycle parts dataset in Blender for our articulated dynamic 3D cyclist generation project. (b) 3DGS bicycle steering and pedal angle manipulation. The Gaussian attributes of the steering and pedal parts are rotated by the desired angles $\theta_s$ and $\theta_p$, respectively, using $4 \times 4$ 3D rotation matrices. We then compose the re-posed output 3DGS bicycle by concatenating the Gaussians of the parts.
  • Figure 3: (a) Deriving the left pedal angle from the rider's ankle 3D Keypoints. (b) Deriving the steering angle from the rider's wrist 3D Keypoints.
  • Figure 4: Example qualitative results. (a) Different 3D persons from RenderPeople riding bicycles from our 3DArticBikes dataset. (b) Typical failure cases: seat too high for some riders, gaze/head pose looking down, rider not properly seated. (c) We tested the generalization of our pipeline with scooters and motorcycles. (d) Cyclist generation with animation from cyclist videos. We compared our method against the Zero-1-to-3 method. (e) Real COCO and Waymo images used to compute the FID/KID metrics reported in the FID/KID evaluation table.
  • Figure 5: Rider Pose Refinement Inverse Kinematics Optimization Iterations.
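
The captions of Figures 2(b) and 3(b) describe rotating part Gaussians by an angle with $4 \times 4$ matrices, concatenating the parts, and deriving the steering angle from the rider's wrist keypoints. The following is a minimal sketch of that idea, not the authors' implementation. Several things are assumptions made for illustration: each part is reduced to a list of Gaussian centers (a full 3DGS model would also need its orientations and covariances rotated), the coordinate frame is taken as x lateral, y forward, z up, the steering axis is simplified to vertical, and all function names (`rotation_about_axis`, `transform_means`, `steering_angle_from_wrists`, `compose_bicycle`) are hypothetical.

```python
import math


def rotation_about_axis(axis, angle, pivot):
    """Return a 4x4 homogeneous matrix (nested lists) that rotates by
    `angle` radians about the direction `axis` passing through `pivot`."""
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    # Rodrigues' rotation formula, written out as a 3x3 matrix.
    rot = [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]
    # Translation column chosen so that the pivot maps to itself.
    trans = [
        pivot[i] - sum(rot[i][j] * pivot[j] for j in range(3))
        for i in range(3)
    ]
    return [rot[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform_means(matrix, means):
    """Apply a 4x4 homogeneous transform to a list of 3D Gaussian centers."""
    return [
        tuple(
            matrix[i][0] * px + matrix[i][1] * py + matrix[i][2] * pz + matrix[i][3]
            for i in range(3)
        )
        for (px, py, pz) in means
    ]


def steering_angle_from_wrists(left_wrist, right_wrist):
    """Assumed convention: the steering angle is the yaw of the
    left-to-right wrist vector relative to the lateral +x axis,
    measured in the ground (x-y) plane."""
    dx = right_wrist[0] - left_wrist[0]
    dy = right_wrist[1] - left_wrist[1]
    return math.atan2(dy, dx)


def compose_bicycle(frame, steering, pedals, theta_s, theta_p,
                    steer_pivot, crank_pivot):
    """Rotate the steering and pedal parts, then concatenate all parts.
    Simplification: steering turns about the vertical z axis and the
    crank turns about the lateral x axis."""
    t_s = rotation_about_axis((0.0, 0.0, 1.0), theta_s, steer_pivot)
    t_p = rotation_about_axis((1.0, 0.0, 0.0), theta_p, crank_pivot)
    return frame + transform_means(t_s, steering) + transform_means(t_p, pedals)
```

In this sketch, the wrist positions from a video frame would give $\theta_s$ through `steering_angle_from_wrists`, and the resulting angle would be passed to `compose_bicycle` together with a pedal angle $\theta_p$ obtained in a similar way from the ankle keypoints.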