SwitchMT: An Adaptive Context Switching Methodology for Scalable Multi-Task Learning in Intelligent Autonomous Agents

Avaneesh Devkota; Rachmad Vidya Wicaksana Putra; Muhammad Shafique

TL;DR

SwitchMT addresses scalable multi-task reinforcement learning for autonomous agents that operate in dynamic environments and process data streams, where fixed task-switching schedules cause inefficiency and task interference. It couples an adaptive task-switching policy with a Deep Spiking Q-Network that features active dendrites and a dueling architecture, and it selects MTSpark_ADD as the base network for enhanced context specialization. The switching policy uses both rewards and the real-time internal dynamics of the network, measured by a parameter-change metric $\Delta \theta$, to decide when to switch tasks; this reduces wasted training time and penalties from overfitting. Empirical results on Atari benchmarks show SwitchMT achieving competitive performance against state-of-the-art methods and human baselines, while improving learning efficiency and generalization for multi-task RL in autonomous agents.
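To make the switching idea concrete, the following is a minimal sketch of a decision rule that combines a reward signal with a parameter-change metric. It is not the authors' implementation: the L2 norm as the definition of $\Delta \theta$, the "reward target reached or parameters stalled" rule, and the names `parameter_change`, `should_switch`, `reward_target`, and `delta_tol` are all assumptions made for illustration.

```python
import numpy as np


def parameter_change(theta_prev, theta_curr):
    """Return the L2 norm of the parameter update.

    Assumption: this is one plausible reading of the delta-theta metric,
    not necessarily the definition used in the paper.
    """
    diff = np.asarray(theta_curr, dtype=float) - np.asarray(theta_prev, dtype=float)
    return float(np.linalg.norm(diff))


def should_switch(avg_reward, delta_theta, reward_target, delta_tol):
    """Decide whether to move on to the next task.

    Hypothetical rule: switch when the current task has reached its reward
    target, or when the parameters have stopped changing meaningfully, so
    further training on this task would mostly waste time.
    """
    return avg_reward >= reward_target or delta_theta < delta_tol


# Example: the parameters moved by a norm of 5.0, which is well above the
# tolerance, and the reward target is not yet met, so training continues.
delta = parameter_change([0.0, 0.0], [3.0, 4.0])
keep_training = not should_switch(avg_reward=10.0, delta_theta=delta,
                                  reward_target=20.0, delta_tol=0.1)
```

In contrast to a fixed schedule, a rule of this kind lets the time spent on each task vary with how quickly the agent is learning it.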

Abstract

The ability to train intelligent autonomous agents (such as mobile robots) on multiple tasks is crucial for adapting to dynamic real-world environments. However, state-of-the-art reinforcement learning (RL) methods excel only in single-task settings and still struggle to generalize across multiple tasks due to task interference. Moreover, real-world environments require agents to have data stream processing capabilities. Toward this, a state-of-the-art work employs Spiking Neural Networks (SNNs) to improve multi-task learning by exploiting temporal information in data streams, while enabling low-power/energy event-based operations. However, it relies on fixed context/task-switching intervals during training, which limits the scalability and effectiveness of multi-task learning. To address these limitations, we propose SwitchMT, a novel adaptive task-switching methodology for RL-based multi-task learning in autonomous agents. Specifically, SwitchMT employs the following key ideas: (1) a Deep Spiking Q-Network with active dendrites and a dueling structure that utilizes task-specific context signals to create specialized sub-networks; and (2) an adaptive task-switching policy that leverages both rewards and the internal dynamics of the network parameters. Experimental results demonstrate that SwitchMT achieves superior performance in multi-task learning compared to state-of-the-art methods. It achieves competitive scores in multiple Atari games (i.e., Pong: -8.8, Breakout: 5.6, and Enduro: 355.2) compared to the state of the art, showing its better generalized learning capability. These results highlight the effectiveness of our SwitchMT methodology in addressing task interference while automating multi-task learning through adaptive task switching, thereby paving the way for more efficient generalist agents with scalable multi-task learning capabilities.
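For readers unfamiliar with the dueling structure mentioned above, the sketch below shows the standard dueling aggregation, in which a scalar state value and per-action advantages are combined into Q-values by subtracting the mean advantage. It is a non-spiking illustration only: the spiking dynamics and the active-dendrite context gating of SwitchMT are omitted, and the function name `dueling_q_values` is hypothetical.

```python
import numpy as np


def dueling_q_values(value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Uses the common mean-subtracted form:
        Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')
    Assumption: SwitchMT's spiking network may realize this step differently.
    """
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()


# Example with three actions: the mean advantage is 2.0, so the Q-values
# are centered around the state value of 1.0.
q = dueling_q_values(1.0, [1.0, 2.0, 3.0])
best_action = int(np.argmax(q))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, since otherwise a constant could be shifted between them without changing the Q-values.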
