Human-Inspired Pavlovian and Instrumental Learning for Autonomous Agent Navigation

Jingfeng Shan, Francesco Guidi, Mehrdad Saeidi, Enrico Testi, Elia Favarelli, Andrea Giorgetti, Davide Dardari, Alberto Zanella, Giorgio Li Pira, Francesca Starita, Anna Guerra

Abstract

Autonomous agents operating in uncertain environments must balance fast responses with goal-directed planning. Classical model-free (MF) reinforcement learning (RL) often converges slowly and may induce unsafe exploration, whereas model-based (MB) methods are computationally expensive and sensitive to model mismatch. This paper presents a human-inspired hybrid RL architecture that integrates Pavlovian, Instrumental MF, and Instrumental MB components. Inspired by Pavlovian and Instrumental learning in neuroscience, the framework uses contextual radio cues, understood here as georeferenced environmental features acting as conditioned stimuli (CS), to shape intrinsic value signals and bias decision-making. Learning is further modulated by internal motivational drives through a dedicated motivational signal. A Bayesian arbitration mechanism adaptively blends the MF and MB estimates according to their predicted reliability. Simulation results show that, compared to standard RL baselines, the hybrid approach accelerates learning, improves operational safety, and reduces navigation through high-uncertainty regions. Pavlovian conditioning promotes safer exploration and faster convergence, while arbitration enables a smooth transition from exploration to efficient, plan-driven exploitation. Overall, the results highlight the benefits of biologically inspired modularity for robust and adaptive autonomous systems under uncertainty.
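
To make the arbitration idea concrete, the following is a minimal sketch of one way a reliability-weighted blend of MF and MB action values could look. It is not the paper's mechanism: the Beta-posterior reliability model, the error `tolerance`, and the names `BetaReliability`, `p_mb`, and `blend` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaReliability:
    """Beta(a, b) posterior over how often a controller's predictions are accurate.

    Hypothetical reliability model, assumed for illustration only.
    """
    a: float = 1.0  # pseudo-count of accurate predictions (uniform prior)
    b: float = 1.0  # pseudo-count of inaccurate predictions

    def update(self, prediction_error: float, tolerance: float = 0.5) -> None:
        # Assumed rule: a prediction counts as accurate when its
        # absolute error does not exceed the tolerance.
        if abs(prediction_error) <= tolerance:
            self.a += 1.0
        else:
            self.b += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.a / (self.a + self.b)


def p_mb(rel_mf: BetaReliability, rel_mb: BetaReliability) -> float:
    """Weight given to the MB controller: its share of the total reliability."""
    return rel_mb.mean / (rel_mb.mean + rel_mf.mean)


def blend(q_mf: list[float], q_mb: list[float], weight_mb: float) -> list[float]:
    """Convex combination of MF and MB action values, one entry per action."""
    return [weight_mb * qb + (1.0 - weight_mb) * qf for qf, qb in zip(q_mf, q_mb)]


if __name__ == "__main__":
    rel_mf, rel_mb = BetaReliability(), BetaReliability()
    for _ in range(3):
        rel_mb.update(prediction_error=0.1)  # MB predictions within tolerance
        rel_mf.update(prediction_error=2.0)  # MF predictions outside tolerance
    w = p_mb(rel_mf, rel_mb)
    print(w, blend([0.0, 1.0], [1.0, 0.0], w))
```

In this sketch both controllers start equally trusted, and the MB weight rises as the MB controller's predictions prove more accurate than the MF controller's, which is the qualitative behaviour the abstract attributes to arbitration.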

Paper Structure (3 sections, 3 figures)

Figures (3)

  • Figure 6: $P_{\text{MB}}$ as a function of episodes for Instrumental MF-MB (blue solid line) and the proposed PIT MF-MB (Hybrid) algorithm (red solid line). The solid lines show $P_{\text{MB}}$ averaged over the agents (denoted as $\hat{P}_{\text{MB}}$); the shaded areas bound the minimum and maximum values across the agents.
  • Figure 7: Agent final trajectories with (a) Instrumental MF-MB, (b) PIT MF-MB (Hybrid) learning approaches.
  • Figure 8: Learning rates, expressed as steps per episode, for various combinations of learning types. "w/" and "w/o" indicate the presence or absence of the motivational signal.