A Faster and More Reliable Middleware for Autonomous Driving Systems
Yuankai He, Weisong Shi
TL;DR
The paper tackles perception-to-decision latency in high-speed autonomous vehicles by addressing intra-host messaging bottlenecks. It introduces Sensor-in-Memory (SIM), a native-layout, shared-memory transport with lock-free, double-buffered data planes that bypasses (de)serialization while maintaining ROS 2 compatibility. On an NVIDIA Jetson Orin Nano, SIM reduces data-transport latency by up to 98%, mean latency by about 95%, and 95th/99th-percentile tail latency by around 96% compared to ROS 2 zero-copy transports. On a production-ready Level 4 vehicle running Autoware.Universe, localization frequency rises from 7.5 Hz to 9.5 Hz, and average perception-to-decision latency drops from about 522 ms to 290 ms when SIM is applied across all latency-critical modules. These results indicate that SIM can improve perception-to-control responsiveness and safety margins while preserving the open-source ROS 2 ecosystem and allowing incremental deployment along intra-host paths.
Abstract
Ensuring safety in high-speed autonomous vehicles requires rapid control loops and tightly bounded delays from perception to actuation. Many open-source autonomy systems rely on ROS 2 middleware; when multiple sensor and control nodes share one compute unit, ROS 2 and its DDS transports add significant (de)serialization, copying, and discovery overheads, shrinking the available time budget. We present Sensor-in-Memory (SIM), a shared-memory transport designed for intra-host pipelines in autonomous vehicles. SIM keeps sensor data in native memory layouts (e.g., cv::Mat, PCL), uses lock-free bounded double buffers that overwrite old data to prioritize freshness, and integrates into ROS 2 nodes with four lines of code. Unlike traditional middleware, SIM operates beside ROS 2 and is optimized for applications where data freshness and minimal latency outweigh guaranteed completeness. SIM provides sequence numbers, a writer heartbeat, and optional checksums to ensure ordering, liveness, and basic integrity. On an NVIDIA Jetson Orin Nano, SIM reduces data-transport latency by up to 98% compared to ROS 2 zero-copy transports such as FastRTPS and Zenoh, lowers mean latency by about 95%, and narrows 95th/99th-percentile tail latencies by around 96%. In tests on a production-ready Level 4 vehicle running Autoware.Universe, SIM increased localization frequency from 7.5 Hz to 9.5 Hz. Applied across all latency-critical modules, SIM cut average perception-to-decision latency from 521.91 ms to 290.26 ms, reducing emergency braking distance at 40 mph (64 km/h) on dry concrete by 13.6 ft (4.14 m).
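To make the data-plane idea in the abstract more concrete, the following is a minimal single-process sketch of the bookkeeping behind an overwrite-on-write double buffer with a sequence number, a writer heartbeat, and an optional checksum. It is not SIM's implementation: the class name DoubleBufferModel, its fields, and the choice of CRC-32 are assumptions made for illustration, and plain Python cannot express the atomic operations and memory ordering that a real lock-free shared-memory transport would need.

```python
import time
import zlib


class DoubleBufferModel:
    """Single-process model of a single-writer, overwrite-on-write double buffer.

    All names and the layout are hypothetical. A real shared-memory version
    would need atomic loads/stores with proper memory ordering.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.slots = [bytearray(capacity), bytearray(capacity)]
        self.lengths = [0, 0]
        self.checksums = [0, 0]
        self.front = 0         # slot that readers should consume
        self.seq = 0           # 0 means nothing has been published yet
        self.heartbeat_ns = 0  # monotonic time of the last publish

    def write(self, payload: bytes) -> int:
        """Fill the back slot, then publish it as the new front slot."""
        if len(payload) > self.capacity:
            raise ValueError("payload exceeds slot capacity")
        back = 1 - self.front
        self.slots[back][: len(payload)] = payload
        self.lengths[back] = len(payload)
        self.checksums[back] = zlib.crc32(payload)
        # Publish step: flip the front index, then bump sequence and heartbeat.
        self.front = back
        self.seq += 1
        self.heartbeat_ns = time.monotonic_ns()
        return self.seq

    def read_latest(self, verify: bool = True):
        """Return (seq, payload) for the freshest frame, or None if unavailable."""
        seq_before = self.seq
        if seq_before == 0:
            return None
        idx = self.front
        data = bytes(self.slots[idx][: self.lengths[idx]])
        # With a concurrent writer the sequence could change during the copy,
        # and a reader would retry. In this single-threaded model it never does.
        if self.seq != seq_before:
            return None
        if verify and zlib.crc32(data) != self.checksums[idx]:
            raise ValueError("checksum mismatch")
        return seq_before, data

    def writer_alive(self, timeout_s: float) -> bool:
        """Liveness check: has the writer published within timeout_s seconds?"""
        if self.seq == 0:
            return False
        age_ns = time.monotonic_ns() - self.heartbeat_ns
        return age_ns <= timeout_s * 1e9
```

In this model the writer never waits for a reader: each write goes to the slot that is not currently published, so a slow reader simply misses intermediate frames. A reader that last saw sequence number 1 and then reads sequence number 3 can infer that one frame was overwritten unread, which matches the abstract's preference for freshness over guaranteed completeness.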
