Safe Urban Traffic Control via Uncertainty-Aware Conformal Prediction and World-Model Reinforcement Learning

Joydeep Chandra; Satyam Kumar Navneet; Aleksandr Algazinov; Yong Zhang

TL;DR

The paper addresses safe urban traffic control by propagating calibrated uncertainty across forecasting, anomaly detection, and reinforcement learning. It introduces PU-GAT+ for uncertainty-guided attention, CRFN-BY for dependence-robust anomaly detection with conformal p-values, and LyCon-WRL+ for Lyapunov-certified safe RL with Lipschitz bounds from spectral normalization. The framework reports distribution-free coverage (≈91.4%), FDR control under dependence (≈4.1%), and a high share of safe RL episodes (≈95.2%) with real-time inference (~23 ms), indicating that reliability guarantees can be maintained without sacrificing performance. These results point to practical value for deploying robust, safe ML-guided traffic control in urban environments. The paper also outlines limitations, such as the conservatism of the BY procedure and scalability challenges, that are worth addressing in future work.
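To make "distribution-free coverage" concrete, here is a minimal sketch of plain split conformal prediction, which is a simplification and not the paper's spatially-adaptive calibration. The function names and the calibration numbers are hypothetical, and absolute residuals are assumed as the nonconformity score.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # 1-indexed rank of the order statistic used as the interval half-width.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]

def prediction_interval(point_forecast, half_width):
    """Symmetric interval around a point forecast."""
    return (point_forecast - half_width, point_forecast + half_width)

# Hypothetical held-out calibration pairs: (forecast, observed).
calibration = [(10.0, 11.0), (12.0, 11.5), (9.0, 9.2), (15.0, 13.0),
               (11.0, 11.1), (14.0, 15.5), (8.0, 8.4), (13.0, 12.7),
               (10.5, 10.0)]
scores = [abs(observed - forecast) for forecast, observed in calibration]

q = conformal_quantile(scores, alpha=0.1)   # target coverage of 90%
low, high = prediction_interval(12.0, q)
print(q, low, high)
```

Under exchangeability of calibration and test points, an interval built this way covers the true value with probability at least 1 - alpha. The paper's own guarantee is stated under mixing conditions, which this sketch does not model.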

Abstract

Urban traffic management demands systems that simultaneously predict future conditions, detect anomalies, and take safe corrective actions, all while providing reliability guarantees. We present STREAM-RL, a unified framework that introduces three novel algorithmic contributions: (1) PU-GAT+, an Uncertainty-Guided Adaptive Conformal Forecaster that uses prediction uncertainty to dynamically reweight graph attention via confidence-monotonic attention, achieving distribution-free coverage guarantees; (2) CRFN-BY, a Conformal Residual Flow Network that models uncertainty-normalized residuals via normalizing flows with Benjamini-Yekutieli FDR control under arbitrary dependence; and (3) LyCon-WRL+, an Uncertainty-Guided Safe World-Model RL agent with Lyapunov stability certificates, certified Lipschitz bounds, and uncertainty-propagated imagination rollouts. To our knowledge, this is the first framework to propagate calibrated uncertainty from forecasting through anomaly detection to safe policy learning with end-to-end theoretical guarantees. Experiments on multiple real-world traffic trajectory datasets demonstrate that STREAM-RL achieves 91.4% coverage efficiency, controls FDR at 4.1% under verified dependence, and improves the safety rate to 95.2%, compared to 69% for standard PPO, while achieving higher reward, with 23 ms end-to-end inference latency.
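The anomaly-detection step combines two standard ingredients, conformal p-values and the Benjamini-Yekutieli (BY) step-up rule. The sketch below shows both in their textbook form; it is not the CRFN-BY implementation. The function names and toy values are hypothetical, and the scores stand in for whatever nonconformity score the flow model produces (larger means more anomalous).

```python
def conformal_p_value(calibration_scores, test_score):
    """Share of calibration scores at least as extreme as the test score,
    with the +1 correction that makes the p-value super-uniform."""
    n = len(calibration_scores)
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least) / (n + 1)

def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.
    BY divides the Benjamini-Hochberg thresholds by c(m) = sum_{i<=m} 1/i,
    which keeps FDR <= alpha under arbitrary dependence."""
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank  # step-up: keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Toy example with hypothetical p-values for four sensors.
flags = benjamini_yekutieli([0.001, 0.5, 0.004, 0.9], alpha=0.05)
print(flags)
```

The extra factor c(m) grows roughly like ln(m), which is the source of the conservatism that the TL;DR lists as a limitation.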

Paper Structure (64 sections, 8 theorems, 71 equations, 13 figures, 18 tables, 3 algorithms)

Introduction
Related Work
Problem Formulation
Methodology
PU-GAT$^+$: Confidence-Monotonic Attention
Motivation: Why Temperature Scaling Fails
Pairwise Uncertainty Mechanism with Monotonicity Constraint
Uncertainty Propagation Across Layers
Dual-Stream Temporal Decomposition
Spatially-Adaptive Conformal Calibration
CRFN-BY: Dependence-Robust Anomaly Detection
Uncertainty-Normalized Residuals
Context-Conditioned Normalizing Flow
Conformalized P-Value Construction
Contamination-Robust Calibration
...and 49 more sections

Key Result

Proposition 4.1

For temperature-scaled attention (the paper's temperature-scaling equation), the relative attention ratio between any two neighbors $j, k \in \mathcal{N}(i)$ is independent of their uncertainties: $\alpha_{ij}^{\text{temp}}/\alpha_{ik}^{\text{temp}} = \exp(e_{ij})/\exp(e_{ik})$.
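A small numerical check can illustrate the proposition. The exact temperature-scaled form is not reproduced on this page, so the sketch assumes a softmax over logits divided by one temperature shared by all neighbors of node $i$; the function names and numbers are hypothetical. Under that assumption the ratio depends only on the scaled logits, and no neighbor-specific uncertainty enters.

```python
import math

def temperature_softmax(logits, temperature):
    """Softmax over neighbor logits with one shared temperature (assumed form)."""
    scaled = [e / temperature for e in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def attention_ratio(logits, temperature, j, k):
    """Ratio alpha_ij / alpha_ik for neighbors j and k of the same node."""
    alpha = temperature_softmax(logits, temperature)
    return alpha[j] / alpha[k]

# Hypothetical attention logits e_ij for three neighbors of node i.
logits = [1.2, 0.3, -0.5]
for temperature in (0.5, 1.0, 2.0):
    ratio = attention_ratio(logits, temperature, 0, 1)
    # The normalizer cancels, leaving only the two scaled logits.
    expected = math.exp((logits[0] - logits[1]) / temperature)
    print(temperature, ratio, expected)
```

Because the softmax normalizer cancels, changing the temperature rescales every neighbor's logit in the same way and cannot down-weight one uncertain neighbor relative to another, which is the limitation the proposition names.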

Figures (13)

Figure 1: Architecture of STREAM-RL. The framework integrates: (1) PU-GAT+ for uncertainty-aware forecasting, (2) CRFN-BY for dependence-robust anomaly detection, and (3) LyCon-WRL+ for certified safe RL. Dashed lines show cross-module uncertainty propagation.
Figure 2: Calibration diagnostics: (a) PIT Histogram and (b) Reliability Diagram.
Figure 3: RL learning curves with/without upstream uncertainty. Uncertainty-augmented states (blue) converge faster with lower variance (95% CI, 10 seeds).
Figure 4: Comprehensive forecasting results comparison across baseline methods. PU-GAT$^+$ (highlighted in gold) achieves best performance on all metrics: NRMSE (0.254), MAE (1498), Coverage (91.4%), RIW (0.43), and Coverage Efficiency (2.13). RIPCN shows under-coverage violation (89.8% < 90% target, highlighted in red). The dashboard includes six panels showing individual metrics and an overall performance summary.
Figure 5: Anomaly detection performance with FDR control analysis. Top-left: Precision, Recall, and F1 scores on synthetic anomalies. Top-right: FDR control on synthetic data with 5% threshold line. Bottom-left: Recall on real documented events. Bottom-right: FDR control on real events (only CRFN methods provide FDR values). Bottom panel: Key findings summary. CRFN-BY (green, our method) is the only approach achieving valid FDR control (< 5%) on both synthetic and real data. CRFN+BH (red) violates the FDR target.
...and 8 more figures

Theorems & Definitions (22)

Proposition 4.1: Temperature Scaling Limitation
proof
Definition 4.2: PU-GAT$^+$ Attention
Proposition 4.3: Confidence-Monotonic Attention
Remark 4.4: Interpretation
Theorem 4.5: Coverage Guarantee under Mixing
Definition 4.6: Conformalized P-Values
Lemma 4.7: Trimming Preserves Super-Uniformity
Theorem 4.8: FDR Control under Arbitrary Dependence
Definition 4.9: Traffic Safety Constraints
...and 12 more
