WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge

Saravanabalagi Ramachandran, Nathaniel Cibik, Ganesh Sistu, John McDonald

TL;DR

This work presents the WoodScape fisheye motion segmentation challenge at CVPR OmniCV 2023, combining WoodScape real data with the PD-WoodScape synthetic dataset to study cross-domain learning under fisheye surround-view conditions. It analyzes datasets, evaluation metrics, and reward structures, and reports baseline results that reveal a notable synthetic-to-real domain gap. The top approaches employ transformer-based architectures (Mask2Former with Swin backbones), model ensembles, and domain adaptation techniques to achieve robust motion segmentation, with Team STAR attaining the highest score. The study provides practical insights into the effectiveness of synthetic data, domain adaptation, phased training, and ensembling, establishing benchmarks and guiding future work in fisheye motion segmentation for autonomous driving.

Abstract

Motion segmentation is a complex yet indispensable task in autonomous driving. The challenges introduced by the ego-motion of the cameras, radial distortion in fisheye lenses, and the need for temporal consistency make the task more complicated, rendering traditional and standard Convolutional Neural Network (CNN) approaches less effective. The consequent laborious data labeling, representation of diverse and uncommon scenarios, and extensive data capture requirements underscore the imperative of synthetic data for improving machine learning model performance. To this end, we employ the PD-WoodScape synthetic dataset developed by Parallel Domain, alongside the WoodScape fisheye dataset. Thus, we present the WoodScape fisheye motion segmentation challenge for autonomous driving, held as part of the CVPR 2023 Workshop on Omnidirectional Computer Vision (OmniCV). As one of the first competitions focused on fisheye motion segmentation, we aim to explore and evaluate the potential and impact of utilizing synthetic data in this domain. In this paper, we provide a detailed analysis of the competition, which attracted the participation of 112 global teams and a total of 234 submissions. This study delineates the complexities inherent in the task of motion segmentation, emphasizes the significance of fisheye datasets, articulates the necessity for synthetic datasets and the resultant domain gap they engender, and outlines the foundational blueprint for devising successful solutions. Subsequently, we delve into the details of the baseline experiments and winning methods, evaluating their qualitative and quantitative results and providing useful insights.

Paper Structure (14 sections, 1 equation, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Illustration of a typical automotive surround-view system consisting of four fisheye cameras located at the front, rear, and on each wing mirror, covering the entire $360^\circ$ around the vehicle. Left: Field of view for each of the four cameras with overlaps. Right: An example frame with images from all the cameras from the WoodScape dataset.
  • Figure 2: Illustration of various perception tasks in the WoodScape dataset.
  • Figure 3: Illustration of motion segmentation annotations. Note that while both the real and synthetic datasets offer ground truth with multiple classes, the task for the challenge is a binary classification between static and motion pixels. (Top row: real images from the WoodScape dataset. Bottom row: synthetic images from the PD-WoodScape dataset.)
  • Figure 4: Trend in the number of daily submissions and their scores over the entire duration of the competition.
  • Figure 5: Motion segmentation predictions (overlaid on the input image and highlighted in green) by the top 3 teams compared with the reference for 3 randomly chosen images. Left to right: Reference (ground truth labels), Team STAR (winner), Team USTC-IAT-United (second place) and Team XMU-UAV (third place).
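
The Figure 3 caption notes that the ground truth has multiple classes while the challenge task is binary (static vs. motion). A minimal sketch of that reduction follows, together with a generic overlap score for comparing a prediction against the reference. The label convention (class ID 0 means static), the function names, and the choice of intersection-over-union are all assumptions made for illustration; this summary does not state the datasets' actual class IDs or the challenge's official metric.

```python
import numpy as np

# Assumed convention (hypothetical): class ID 0 marks static pixels and every
# other ID marks some category of moving object. The real datasets may differ.
STATIC_ID = 0


def to_binary_motion_mask(class_map: np.ndarray, static_id: int = STATIC_ID) -> np.ndarray:
    """Collapse a multi-class label map into a boolean motion mask."""
    return class_map != static_id


def binary_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection-over-union of the motion class for two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Neither mask contains motion pixels; treat this as perfect agreement.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


# Toy 2x3 example: IDs 2 and 3 stand in for two moving-object classes.
ground_truth = np.array([[0, 0, 2],
                         [0, 3, 3]])
reference = to_binary_motion_mask(ground_truth)
prediction = np.array([[False, False, True],
                       [False, True, False]])
score = binary_iou(prediction, reference)
```

In the toy example the prediction recovers two of the three motion pixels and adds no false positives, so the intersection is 2 pixels and the union is 3.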