PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving

Yi Zhang, Xian Zhang, Saisi Zhao, Yinglei Song, Chengdong Wu, Nenad Petrovic, Alois Knoll

TL;DR

PRAM-R is a unified Perception-Reasoning-Action-Memory framework that uses LLM-guided modality routing to make multimodal perception for autonomous driving efficient and adaptive.

Abstract

Multimodal perception enables robust autonomous driving but incurs unnecessary computational cost when all sensors remain active. This paper presents PRAM-R, a unified Perception-Reasoning-Action-Memory framework with LLM-Guided Modality Routing for adaptive autonomous driving. PRAM-R adopts an asynchronous dual-loop design: a fast reactive loop for perception and control, and a slow deliberative loop for reasoning-driven modality selection and memory updates. An LLM router selects and weights modalities using environmental context and sensor diagnostics, while a hierarchical memory module preserves temporal consistency and supports long-term adaptation. We conduct a two-stage evaluation: (1) synthetic stress tests for stability analysis and (2) real-world validation on the nuScenes dataset. The synthetic stress tests confirm an 87.2% reduction in routing oscillations via hysteresis-based stabilization. Real-world validation on nuScenes shows a 6.22% modality reduction with 20% memory recall while maintaining trajectory accuracy comparable to full-modality baselines in complex urban scenarios. Our work demonstrates that LLM-augmented architectures with hierarchical memory can achieve efficient, adaptive multimodal perception in autonomous driving.

Paper Structure (24 sections, 9 equations, 5 figures, 4 tables, 1 algorithm)

Figures (5)

  • Figure 1: Perception-Reasoning-Action-Memory framework with modality routing for complex scene interpretation and motion control in autonomous driving: (1) Perception layer, (2) Language layer, (3) Action layer, and (4) Memory layer.
  • Figure 2: Hierarchical and sequential organization of PRAM-R memory components and their temporal scales.
  • Figure 3: Threshold-Fluctuation Stress Test. Comparison of routing behavior under high-frequency perturbations. (a) Weights before and after EMA and hysteresis gating across timesteps. (b) Corresponding binary activation states.
  • Figure 4: Combined Memory Recall Rate. Comparison of short-term (rolling) and long-term (cumulative) recall rates for two different sampling frequencies.
  • Figure 5: Comprehensive routing statistics across nuScenes. (a) Modality activation rates demonstrate adaptive engagement. (b) Mean weight trajectories illustrate stable yet flexible fusion. (c) Reliability distributions differ across modalities.
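
The EMA smoothing and hysteresis gating named in the Figure 3 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the smoothing factor `alpha`, and the two thresholds are assumptions chosen for illustration, and the summary above does not state the values used in the paper. The sketch shows only the general idea that a smoothed weight combined with separate on and off thresholds switches a modality less often than a single hard threshold does.

```python
from typing import List, Optional


def naive_gate(weights: List[float], threshold: float = 0.5) -> List[bool]:
    """Baseline: activate a modality whenever its raw weight meets one threshold."""
    return [w >= threshold for w in weights]


def ema_hysteresis_gate(
    weights: List[float],
    alpha: float = 0.3,          # assumed EMA smoothing factor
    on_threshold: float = 0.6,   # assumed level needed to switch on
    off_threshold: float = 0.4,  # assumed level needed to switch off
    initially_active: bool = True,
) -> List[bool]:
    """Smooth raw weights with an EMA, then gate with two thresholds."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if off_threshold > on_threshold:
        raise ValueError("off_threshold must not exceed on_threshold")

    states: List[bool] = []
    active = initially_active
    smoothed: Optional[float] = None
    for w in weights:
        # Exponential moving average; the first sample seeds the average.
        smoothed = w if smoothed is None else alpha * w + (1.0 - alpha) * smoothed
        # Hysteresis: the state changes only when the smoothed weight
        # leaves the band between the two thresholds.
        if active and smoothed < off_threshold:
            active = False
        elif not active and smoothed > on_threshold:
            active = True
        states.append(active)
    return states


def count_switches(states: List[bool]) -> int:
    """Number of on/off transitions in a sequence of activation states."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


if __name__ == "__main__":
    # Hypothetical perturbation: a weight jittering around a 0.5 threshold.
    jitter = [0.55, 0.45] * 10
    print("naive switches:", count_switches(naive_gate(jitter)))
    print("gated switches:", count_switches(ema_hysteresis_gate(jitter)))
```

With the jittering input, the single-threshold gate toggles on every step, while the smoothed weight stays inside the hysteresis band and the gated state never changes. A sustained drop in the weight still switches the modality off after a few steps, so the gate delays genuine changes rather than ignoring them.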