FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving

Ganesh Sistu, Senthil Yogamani

TL;DR

The paper tackles object detection for fisheye surround-view cameras, where radial distortion makes the standard bounding-box representation a poor fit. It introduces and evaluates rotated bounding boxes, ellipses, and, most effectively, polygon-based representations, implemented in FisheyeDetNet, a lightweight ResNet-18–based detector with a YOLO-style head. The key contribution is the demonstration that learning the right geometric representation, particularly dense polygons in polar space, substantially improves detection accuracy (achieving $49.5\%$ mAP on the Valeo fisheye surround-view dataset) and downstream parking-slot perception, without resorting to heavy models. This enables robust near-field perception for autonomous driving on low-power hardware and highlights the practical impact of representation choice in fisheye perception systems.

Abstract

Object detection is a mature problem in autonomous driving, with pedestrian detection being one of the first deployed algorithms. It has been comprehensively studied in the literature. However, object detection is relatively less explored for fisheye cameras used for surround-view near-field sensing. The standard bounding box representation fails in fisheye cameras due to heavy radial distortion, particularly in the periphery. To mitigate this, we explore extending the standard object detection output representation beyond the bounding box. We design rotated bounding boxes, ellipses, and generic polygons as polar arc/angle representations, and define an instance segmentation mIoU metric to analyze these representations. The proposed model, FisheyeDetNet with polygon output, outperforms the others and achieves an mAP score of 49.5% on the Valeo fisheye surround-view dataset for automated driving applications. This dataset has 60K images captured from 4 surround-view cameras across Europe, North America and Asia. To the best of our knowledge, this is the first detailed study on object detection on fisheye cameras for autonomous driving scenarios.
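
The abstract does not spell out the polygon encoding or the mIoU computation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a polygon is described by a center and a set of radii sampled at uniformly spaced angles, and that representations are compared by the IoU of their rasterized masks against an instance mask. The function names `polar_polygon_mask` and `mask_iou`, the 24-vertex count, and the toy disk-shaped object are all hypothetical choices made for illustration.

```python
import numpy as np


def polar_polygon_mask(center, radii, shape):
    """Rasterize a star-shaped region given by radii at uniformly spaced angles.

    Assumption: the boundary radius between two sampled angles is linearly
    interpolated in angle, which only approximates straight polygon edges.
    """
    cx, cy = center
    n = len(radii)
    angles = np.arange(n) * 2.0 * np.pi / n          # sampled vertex angles
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    dx, dy = xs - cx, ys - cy
    pix_r = np.hypot(dx, dy)                         # pixel distance to center
    pix_theta = np.arctan2(dy, dx) % (2.0 * np.pi)   # pixel angle in [0, 2*pi)
    # Periodic interpolation gives the boundary radius at each pixel's angle.
    boundary = np.interp(pix_theta, angles, radii, period=2.0 * np.pi)
    return pix_r <= boundary


def mask_iou(a, b):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union else 0.0


# Toy ground truth: a disk of radius 10 centered in a 64x64 image.
ys, xs = np.mgrid[0:64, 0:64]
gt = np.hypot(xs - 32, ys - 32) <= 10

# Polar polygon with 24 vertices (hypothetical vertex count).
poly = polar_polygon_mask((32, 32), np.full(24, 10.0), gt.shape)

# Tightest axis-aligned bounding box around the same object.
box = np.zeros_like(gt)
box[22:43, 22:43] = True

print("polygon IoU:", mask_iou(poly, gt))
print("box IoU:    ", mask_iou(box, gt))
```

On this toy object the polygon mask scores a higher IoU than the bounding box, because the box also covers background pixels in its corners. This is the kind of gap a mask-based IoU metric is meant to expose when comparing output representations.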

Paper Structure (23 sections, 5 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Surround-view camera network images showing near field sensing and wide field of view
  • Figure 2: Autonomous Driving Pipeline
  • Figure 3: Center: Front camera image. Right (B): Bounding boxes representing objects correctly. Left (A): Bounding boxes and oriented boxes fail to represent objects accurately; more details in the Introduction.
  • Figure 4: Comparison between MaskRCNN and Multi-task Network Cascade. Both models are two stage approaches and use FasterRCNN components (blocks not colored)
  • Figure 5: Undistorting the fisheye image: (a) Rectilinear correction; (b) Piecewise linear correction; (c) Cylindrical correction. Left: raw image; Right: undistorted image.
  • ...and 5 more figures