Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm
Amrapali Pednekar, Álvaro Garrido-Pérez, Yara Khaluf, Pieter Simoens
TL;DR
This work examines how a concurrent cognitive task alters temporal production in deep reinforcement learning (DRL) agents trained on a simplified, Overcooked-inspired environment. Using two task variants, single-task (T) and dual-task (T+N), with four target durations ($7$, $8$, $9$, $10$), the study finds that dual-task agents overproduce time relative to their single-task counterparts, an effect echoing the human timing literature. Neural analyses of LSTM dynamics reveal oscillations and rich latent structure but do not provide conclusive evidence for an intrinsic time-keeping mechanism, suggesting the observed bias arises from more complex, distributed dynamics. Overall, the results connect emergent DRL behavior with known temporal interference phenomena in biological systems, offering avenues for deeper exploration of how cognition and timing interact in artificial agents.
Abstract
This study explores interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective. In this context, the dual-task setup is implemented as a simplified version of the Overcooked environment with two variations, single task (T) and dual task (T+N). Both variations involve an embedded time production task, but the dual task (T+N) additionally involves a concurrent number comparison task. Two deep reinforcement learning (DRL) agents were trained separately, one on each variation. These agents exhibited emergent behavior consistent with human timing research. Specifically, the dual-task (T+N) agent exhibited significant overproduction of time relative to its single-task (T) counterpart. This result was consistent across four target durations. Preliminary analysis of neural dynamics in the agents' LSTM layers did not reveal any clear evidence of a dedicated or intrinsic timer. Hence, further investigation is needed to better understand the agents' underlying time-keeping mechanisms and to explain the observed behavioral patterns. This study is a small step towards exploring parallels between emergent DRL behavior and behavior observed in biological systems, in order to facilitate a better understanding of both.
