Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning

Haozhe Ma; Zhengding Luo; Thanh Vinh Vo; Kuankuan Sima; Tze-Yun Leong

TL;DR

This paper tackles sparse-reward challenges in multi-task reinforcement learning by introducing CenRA, a framework that couples a Centralized Reward Agent (CRA) with multiple policy agents. The CRA distills cross-task knowledge into dense, task-informed knowledge rewards and distributes them back to the policy agents to accelerate learning, while an information synchronization mechanism balances knowledge sharing based on task similarity and real-time learning progress. Experiments across discrete and continuous domains (Meta-World, 2DMaze, 3DPickup, and MujocoCar) show that CenRA converges faster, transfers robustly to unseen tasks, and performs more stably and evenly across tasks than the compared baselines. The work highlights the practical value of centralized reward shaping for efficient, transferable multi-task RL, and it outlines limitations and future directions in adaptive weighting and heterogeneous-task extensions.

Abstract

Reward shaping is effective in addressing the sparse-reward challenge in reinforcement learning (RL) by providing immediate feedback through auxiliary, informative rewards. Based on the reward shaping strategy, we propose a novel multi-task reinforcement learning framework that integrates a centralized reward agent (CRA) and multiple distributed policy agents. The CRA functions as a knowledge pool, aimed at distilling knowledge from various tasks and distributing it to individual policy agents to improve learning efficiency. Specifically, the shaped rewards serve as a straightforward metric for encoding knowledge. This framework not only enhances knowledge sharing across established tasks but also adapts to new tasks by transferring meaningful reward signals. We validate the proposed method on both discrete and continuous domains, including the representative Meta-World benchmark, demonstrating its robustness in multi-task sparse-reward settings and its effective transferability to unseen tasks.
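The reward-shaping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the additive combination, the `weight` parameter, and the distance-based `toy_knowledge_reward` function are all assumptions introduced here for illustration, standing in for whatever dense signal the centralized reward agent would actually produce.

```python
from typing import Callable, Sequence

State = Sequence[float]


def shaped_reward(
    env_reward: float,
    state: State,
    action: int,
    knowledge_reward_fn: Callable[[State, int], float],
    weight: float = 1.0,
) -> float:
    """Add a dense auxiliary reward to a sparse environment reward.

    Assumption: the two signals are combined additively with a scalar weight.
    """
    return env_reward + weight * knowledge_reward_fn(state, action)


def toy_knowledge_reward(state: State, action: int) -> float:
    """Hypothetical stand-in for a centralized reward agent's output.

    Returns the negative Euclidean distance to a fixed goal, so the signal
    is informative at every step even when the environment reward is zero.
    """
    goal = (1.0, 1.0)
    distance = sum((s - g) ** 2 for s, g in zip(state, goal)) ** 0.5
    return -distance


# Sparse environment reward is 0 away from the goal, but the shaped
# reward still tells the agent how far it is from the goal.
far_from_goal = shaped_reward(0.0, (1.0, 0.0), 0, toy_knowledge_reward, weight=0.5)
at_goal = shaped_reward(1.0, (1.0, 1.0), 0, toy_knowledge_reward, weight=0.5)
```

In this sketch a policy agent would train on the shaped value instead of the raw sparse reward; in the framework described above, the auxiliary term would come from a learned, shared agent rather than a hand-written distance function.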

Paper Structure (21 sections, 5 equations, 7 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Methodology
Knowledge Distillation and Distribution
Problem Formulation
Centralized Reward Agent
Policy Agents with Knowledge Rewards
Information Synchronization of Policy Agents
Overall Framework
Experiments
Comparative Evaluation in MTRL
Knowledge Transfer to New Tasks
Effect of Sampling Weight
Discussion and Conclusion
...and 6 more sections

Figures (7)

Figure 1: A high-level illustration of the CenRA framework. The centralized reward agent functions as a knowledge repository, distilling information from various tasks and distributing it to individual policy agents to enhance learning efficiency.
Figure 2: Environments with multiple tasks. (a) Meta-World: two sparse-reward variants, ML10-sparse and ML50-sparse, covering diverse robotic manipulation tasks. (b) 2DMaze: 2D maze tasks in which the agent must pick up a key and then pass through a door to exit. (c) 3DPickup: 3D maze tasks in which the agent navigates to and picks up different target objects at different locations. (d) MujocoCar: a MuJoCo-based race car that must navigate to different specified areas.
Figure 3: Comparison of CenRA with baselines in 2DMaze, 3DPickup, and MujocoCar domains.
Figure 4: Experimental results for knowledge transfer to new tasks.
Figure 5: Illustration of multiple tasks in different domains in our experiments.
...and 2 more figures
