Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm

Amrapali Pednekar, Álvaro Garrido-Pérez, Yara Khaluf, Pieter Simoens

TL;DR

This work examines how a concurrent cognitive task alters temporal production in DRL agents trained on a simplified, Overcooked-inspired environment. Using two task variants, single-task (T) and dual-task (T+N), with four target durations ($7$, $8$, $9$, $10$), the study finds that dual-task agents overproduce time relative to their single-task counterparts, an effect that echoes the human timing literature. Neural analyses of LSTM dynamics reveal oscillations and rich latent structure but provide no conclusive evidence of an intrinsic time-keeping mechanism, suggesting that the observed bias arises from more complex, distributed dynamics. Overall, the results connect emergent DRL behavior with known temporal interference phenomena in biological systems and offer avenues for deeper exploration of how cognition and timing interact in artificial agents.

Abstract

This study explores the interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective. In this context, the dual-task setup is implemented as a simplified version of the Overcooked environment with two variations, single task (T) and dual task (T+N). Both variations involve an embedded time production task, but the dual task (T+N) additionally involves a concurrent number comparison task. Two deep reinforcement learning (DRL) agents were separately trained for each of these tasks. These agents exhibited emergent behavior consistent with human timing research. Specifically, the dual task (T+N) agent exhibited significant overproduction of time relative to its single task (T) counterpart. This result was consistent across four target durations. Preliminary analysis of neural dynamics in the agents' LSTM layers did not reveal any clear evidence of a dedicated or intrinsic timer. Hence, further investigation is needed to better understand the underlying time-keeping mechanisms of the agents and to provide insights into the observed behavioral patterns. This study is a small step towards exploring parallels between emergent DRL behavior and behavior observed in biological systems in order to facilitate a better understanding of both.
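The overproduction claim rests on comparing produced durations between the two agents with an independent t-test. The sketch below shows one way such a comparison could be computed, using the pooled-variance (Student) form; whether the authors used this form or Welch's variant is not stated here, so that choice is an assumption. The function name `independent_t` and the sample values are hypothetical and are not data from the paper.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t_stat = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t_stat, dof


# Made-up oven-timer readings at the first oven check (illustrative only).
single_task = [10, 10, 11, 10, 9, 10, 11, 10]
dual_task = [12, 13, 12, 14, 12, 13, 12, 13]

t_stat, dof = independent_t(dual_task, single_task)
print(f"t = {t_stat:.2f}, dof = {dof}")
```

A positive statistic here means the dual-task sample has the larger mean, which is the direction of the reported effect. Turning the statistic into a p-value requires the t distribution, which a statistics library would normally supply.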

Paper Structure

This paper contains 9 sections, 6 figures.

Figures (6)

  • Figure 1: Grid worlds representing a simplified version of the Overcooked environment (Carroll et al., 2019), used for the single task (T) and dual task (T+N) experiments (icons were sourced from Flaticon.com).
  • Figure 2: Distribution of oven timers corresponding to the 'first oven check' in 25 episodes (each of 100 time steps) across the two task types, shown for different target durations. The dual task (T+N) agent tends to overestimate significantly (p < 0.001) compared with its single task (T) counterpart. The corresponding independent t-test statistics are shown in the plots.
  • Figure 3: Average number of soups produced across the 25 episodes (each of 100 time steps) for the two task types, shown for different target durations.
  • Figure 4: A fast Fourier transform (FFT) of the first principal components of the LSTM hidden state activations (across the first 100 time steps) for the single task (T) (in green) and the dual task (T+N) (in orange) agents across different durations. The black dotted line marks the target interval frequency.
  • Figure 5: PCA of LSTM hidden state activations across 100 time steps for the single task (T) agent.
  • ...and 1 more figures
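
The analysis described in the Figure 4 caption (an FFT of the first principal component of the LSTM hidden states, compared against the target interval frequency) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `first_pc` and `dominant_frequency`, the hidden size of 16, and the synthetic sinusoidal activations are all assumptions standing in for recorded LSTM states.

```python
import numpy as np


def first_pc(hidden):
    """Project (time, units) activations onto their first principal component."""
    centered = hidden - hidden.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


def dominant_frequency(signal):
    """Non-zero frequency (cycles per time step) with the largest FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0)
    peak = np.argmax(spectrum[1:]) + 1  # skip the zero-frequency bin
    return freqs[peak]


# Synthetic stand-in for hidden states: one oscillation with a period equal
# to a target duration of 10 steps, spread across 16 hypothetical units.
rng = np.random.default_rng(0)
n_steps, n_units, target = 100, 16, 10
t = np.arange(n_steps)
loadings = rng.normal(size=n_units)
hidden = np.outer(np.sin(2 * np.pi * t / target), loadings)
hidden += 0.05 * rng.normal(size=(n_steps, n_units))

peak = dominant_frequency(first_pc(hidden))
print(f"peak frequency: {peak:.3f}, target frequency: {1 / target:.3f}")
```

In this synthetic case the spectral peak coincides with the target interval frequency by construction. For a trained agent, a peak at that frequency would be only suggestive of periodic structure, which is consistent with the paper's caution that the oscillations are not conclusive evidence of an intrinsic timer.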