DeepIPC: Deeply Integrated Perception and Control for an Autonomous Vehicle in Real Environments

Oskar Natan, Jun Miura

TL;DR

DeepIPC integrates perception and control for autonomous driving in real environments. An RGBD-based perception front end produces a BEV semantic map, and a dual-branch controller predicts three future waypoints and outputs vehicle controls through a PID-MLP fusion. The model is trained with a multi-task imitation learning objective and evaluated both offline and online against baselines, where it shows better drivability and multi-task efficiency with a leaner architecture. Key contributions include widening the perception ROI, incorporating wheel-speed inputs, using two route points for robustness, and introducing a robust agent-takeover policy, all evaluated with a novel drivability metric. The work suggests that end-to-end perception-control systems with BEV representations can offer practical improvements for real-world autonomous navigation, and it lays groundwork for further enhancements with additional sensors such as LiDAR.

Abstract

In this work, we introduce DeepIPC, a novel end-to-end model tailored for autonomous driving, which seamlessly integrates perception and control tasks. Unlike traditional models that handle these tasks separately, DeepIPC innovatively combines a perception module, which processes RGBD images for semantic segmentation and generates bird's eye view (BEV) mappings, with a controller module that utilizes these insights along with GNSS and angular speed measurements to accurately predict navigational waypoints. This integration allows DeepIPC to efficiently translate complex environmental data into actionable driving commands. Our comprehensive evaluation demonstrates DeepIPC's superior performance in terms of drivability and multi-task efficiency across diverse real-world scenarios, setting a new benchmark for end-to-end autonomous driving systems with a leaner model architecture. The experimental results underscore DeepIPC's potential to significantly enhance autonomous vehicular navigation, promising a step forward in the development of autonomous driving technologies. For further insights and replication, we will make our code and datasets available at https://github.com/oskarnatan/DeepIPC.

Paper Structure

This paper contains 15 sections, 10 equations, 5 figures, 4 tables, 2 algorithms.

Figures (5)

  • Figure 1: DeepIPC perceives the environment by performing image segmentation and BEV semantic mapping. Simultaneously, it estimates waypoints (white dots) and controls to drive the vehicle by following a set of route points (white hollow circles). The detailed architecture of DeepIPC can be seen in Fig. 2.
  • Figure 2: The architecture of DeepIPC. Blue blocks are parts of the perception module, while green blocks are parts of the controller module. Light-colored blocks are not trainable, while the darker ones are trainable. In the BEV semantic map, waypoints are denoted with white dots, while route points are denoted with white circles. Waypoints are points predicted by DeepIPC's controller module based on the features extracted by the perception module. These points are then translated into steering and throttle commands by two PID controllers to navigate the vehicle. Route points, in contrast, are global latitude-longitude coordinates that tell DeepIPC which path to follow from the starting point to the destination. They can be generated with the assistance of applications such as Google Maps.
  • Figure 3: The experiment area. White hollow circles represent a route that consists of start, finish, and a set of route points. (https://goo.gl/maps/9rXobdhP3VYdjXn48)
  • Figure 4: Sensor placement on the robotic vehicle. A rotary encoder is mounted inside each rear wheel.
  • Figure 5: Driving footage. See the driving video (playback speed 5×) at https://youtu.be/AiKotQ-lAzw for more details, including failure cases where we intervene in the model to avoid collisions. Sunny noon: DeepIPC makes a small steering adjustment to the right as the vehicle is too close to the terrain. Cloudy noon: Although DeepIPC cannot segment the car properly, it avoids a collision because it knows that the left side is occupied. Sunset evening: DeepIPC makes a small steering adjustment to stay in its lane. Low-light evening: We intervene in DeepIPC to avoid driving off-road onto the vegetation, as it keeps the throttle at maximum and fails to make a right turn.
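
The Figure 2 caption notes that predicted waypoints are translated into steering and throttle commands by two PID controllers. The sketch below illustrates one common way such a translation can be done; it is not taken from the DeepIPC code. Everything specific in it is an assumption made for illustration: the vehicle-local frame (x to the right, y forward, in meters), the aim point taken as the midpoint of the first two waypoints, the desired speed derived from waypoint spacing, the `waypoint_interval` parameter, the gains, and all function and class names.

```python
import math
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller. Gains are illustrative, not from the paper."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative by finite difference.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clip(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def waypoints_to_controls(waypoints, speed, lateral_pid, longitudinal_pid,
                          dt=0.1, waypoint_interval=0.5):
    """Map local-frame waypoints to (steering, throttle).

    Assumed conventions (hypothetical, for illustration only):
    waypoints are (x, y) pairs in meters with x to the right and y forward,
    and consecutive waypoints are `waypoint_interval` seconds apart.
    """
    (x0, y0), (x1, y1) = waypoints[0], waypoints[1]

    # Lateral control: steer toward the midpoint of the first two waypoints.
    aim_x, aim_y = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    heading_error = math.atan2(aim_x, aim_y)  # 0 when the aim point is straight ahead
    steering = clip(lateral_pid.step(heading_error, dt), -1.0, 1.0)

    # Longitudinal control: waypoint spacing implies a desired speed.
    desired_speed = math.hypot(x1 - x0, y1 - y0) / waypoint_interval
    throttle = clip(longitudinal_pid.step(desired_speed - speed, dt), 0.0, 1.0)
    return steering, throttle


# Usage: three waypoints straight ahead, vehicle at rest.
steering, throttle = waypoints_to_controls(
    [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)],
    speed=0.0,
    lateral_pid=PID(kp=1.0, ki=0.0, kd=0.0),
    longitudinal_pid=PID(kp=0.25, ki=0.0, kd=0.0),
)
print(steering, throttle)  # 0.0 0.5
```

Using separate controllers for the lateral and longitudinal axes keeps each loop simple to tune; in a real vehicle the gains, output limits, and time step would need to be chosen for the platform.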