Reinforcement Learning Enhancement Using Vector Semantic Representation and Symbolic Reasoning for Human-Centered Autonomous Emergency Braking

Vinal Asodia, Iman Sharifi, Saber Fallah

TL;DR

This work addresses the limitations of camera-based reinforcement learning in autonomous driving by combining a holistic, neuro-symbolic feature representation with a soft symbolic reward. It introduces Vector Semantic Representation (VSR), which encodes semantic, spatial, and shape information about dynamic scene elements, and a Soft First-Order Logic (SFOL) reward, which balances safety, efficiency, and comfort through symbolic reasoning. Evaluation in CARLA under occluded-pedestrian scenarios indicates that fusing VSR with SFOL, especially via additive integration, yields more robust policies with higher collision avoidance and stable control across varying traffic densities. By coupling structured scene understanding with value-aligned reward shaping, the approach supports human-centered, context-aware decision-making in autonomous driving and is a step toward safer real-world deployment.

Abstract

The problem with existing camera-based Deep Reinforcement Learning approaches is twofold: they rarely integrate high-level scene context into the feature representation, and they rely on rigid, fixed reward functions. To address these challenges, this paper proposes a novel pipeline that produces a neuro-symbolic feature representation that encompasses semantic, spatial, and shape information, as well as spatially boosted features of dynamic entities in the scene, with an emphasis on safety-critical road users. It also proposes a Soft First-Order Logic (SFOL) reward function that balances human values via a symbolic reasoning module. Here, semantic and spatial predicates are extracted from segmentation maps and applied to linguistic rules to obtain reward weights. Quantitative experiments in the CARLA simulation environment show that the proposed neuro-symbolic representation and SFOL reward function improved policy robustness and safety-related performance metrics compared to baseline representations and reward formulations across varying traffic densities and occlusion levels. The findings demonstrate that integrating holistic representations and soft reasoning into Reinforcement Learning can support more context-aware and value-aligned decision-making for autonomous driving.
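
The idea of turning soft predicates into reward weights can be illustrated with a small sketch. This is not the paper's formulation: the predicate names (`pedestrian_near`, `pedestrian_on_crosswalk`), the single rule, the choice of the minimum t-norm for conjunction, and the two-term blend are all assumptions made here for illustration. The sketch only shows how truth values in [0, 1] could flow through a linguistic rule into safety and efficiency weights.

```python
from typing import Tuple


def soft_and(*truths: float) -> float:
    """Soft conjunction via the minimum t-norm (an assumed choice, not the paper's)."""
    return min(truths)


def soft_not(truth: float) -> float:
    """Soft negation: the complement of a truth value in [0, 1]."""
    return 1.0 - truth


def reward_weights(pedestrian_near: float,
                   pedestrian_on_crosswalk: float) -> Tuple[float, float]:
    """Return (safety_weight, efficiency_weight) from two hypothetical predicates.

    Hypothetical rule: IF a pedestrian is near AND on the crosswalk
    THEN prioritise safety; otherwise prioritise efficiency.
    """
    safety_conf = soft_and(pedestrian_near, pedestrian_on_crosswalk)
    efficiency_conf = soft_not(safety_conf)
    return safety_conf, efficiency_conf


def blended_reward(r_safety: float, r_efficiency: float,
                   w_safety: float, w_efficiency: float) -> float:
    """Weighted sum of the two reward terms (illustrative form only)."""
    return w_safety * r_safety + w_efficiency * r_efficiency


# Toy predicate truth values, as if read from a segmentation map.
w_s, w_e = reward_weights(pedestrian_near=0.9, pedestrian_on_crosswalk=0.8)
reward = blended_reward(r_safety=-1.0, r_efficiency=0.5,
                        w_safety=w_s, w_efficiency=w_e)
```

With these toy inputs the rule fires with confidence 0.8, so the safety term dominates the blended reward. A different t-norm (for example the product) would give a lower rule confidence for the same predicates.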

Paper Structure (33 sections, 12 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Overview of the proposed pipeline. Grayscale camera images are passed through two parallel streams: a ResNet18 autoencoder that produces latent feature embeddings, and a ResNet18 UNet model that generates a semantic map. This map serves two functions: (1) to apply spatial attention to the latent features, and (2) to formulate a hypervector that encodes entity semantics and spatial relations. The resultant hypervector is combined with the spatial features to produce the final VSR vector, which is then fed into a PPO agent for longitudinal control.
  • Figure 2: An illustration of how the $SSI_i$ hypervector is formed for an entity $E_i$ (e.g., pedestrian).
  • Figure 3: Illustration of the SFOL reward computation. Predicates extracted from the semantic map are applied to linguistic rules, producing rule confidence values that determine the safety and efficiency reward weights in Equation \ref{eqn:final_reward}.
  • Figure 4: Illustration of the occluded pedestrian crossing scenario in CARLA. The ego vehicle (circled in Green) starts at one end of the road and must navigate toward the goal whilst another vehicle (circled in Black) occludes part of the crosswalk. As it approaches, a pedestrian (circled in Red) enters the crosswalk, requiring the ego vehicle to yield before proceeding safely.
  • Figure 5: Mean reward over training timesteps for different VSR integration strategies: element-wise addition ($SSI \oplus SF$, in Green), element-wise multiplication ($SSI \otimes SF$, in Blue), the $SSI$ hypervector only ($SSI$ Only, in Magenta), and the $SF$ embeddings only ($SF$ Only, in Yellow).
  • ...and 2 more figures
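
The two integration strategies named in the Figure 5 caption, element-wise addition and element-wise multiplication of the $SSI$ hypervector and the $SF$ embeddings, can be sketched as follows. This is a toy illustration under stated assumptions: the function names `fuse_add` and `fuse_mul`, the four-dimensional vectors, and the bipolar values used for `ssi` are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def _check_same_length(ssi: Vector, sf: Vector) -> None:
    """Both inputs must share one dimensionality for element-wise fusion."""
    if len(ssi) != len(sf):
        raise ValueError("SSI and SF must have the same length")


def fuse_add(ssi: Vector, sf: Vector) -> Vector:
    """Element-wise addition of the two vectors (SSI (+) SF in the caption)."""
    _check_same_length(ssi, sf)
    return [a + b for a, b in zip(ssi, sf)]


def fuse_mul(ssi: Vector, sf: Vector) -> Vector:
    """Element-wise multiplication of the two vectors (SSI (x) SF in the caption)."""
    _check_same_length(ssi, sf)
    return [a * b for a, b in zip(ssi, sf)]


# Toy inputs: a bipolar hypervector (assumed encoding) and spatial features.
ssi = [1.0, -1.0, 1.0, -1.0]
sf = [0.5, 0.25, 2.0, 1.0]

vsr_add = fuse_add(ssi, sf)  # [1.5, -0.75, 3.0, 0.0]
vsr_mul = fuse_mul(ssi, sf)  # [0.5, -0.25, 2.0, -1.0]
```

One difference between the two operations is visible even in this toy case: under multiplication, any dimension where either input is zero becomes zero in the fused vector, whereas addition keeps the contribution of the other input.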