Learning Where, What and How to Transfer: A Multi-Role Reinforcement Learning Approach for Evolutionary Multitasking

Jiajun Zhan, Zeyuan Ma, Yue-Jiao Gong, Kay Chen Tan

TL;DR

This work addresses evolutionary multitasking (EMT) by learning a holistic, generalizable policy that decides where, what, and how to transfer knowledge across tasks. It introduces MetaMTO, a multi-role RL system with a Task Routing agent, a Knowledge Control agent, and a Transfer Strategy Adaptation group, trained end-to-end with PPO on an augmented multitask problem distribution (AWCCI). Empirical results show state-of-the-art performance against both human-crafted and learning-assisted baselines, and ablation and interpretability analyses indicate that intelligent routing and adaptive transfer strategies drive the gains. The approach offers a scalable, data-driven framework for automated EMT, with practical implications for robust multitask optimization.

Abstract

Evolutionary multitasking (EMT) algorithms typically require tailored designs for knowledge transfer in order to ensure convergence and optimality in multitask optimization. In this paper, we explore designing a systematic and generalizable knowledge transfer policy through reinforcement learning (RL). We first identify three major challenges: determining the task to transfer (where), the knowledge to be transferred (what), and the mechanism for the transfer (how). To address these challenges, we formulate a multi-role RL system in which three (groups of) policy networks act as specialized agents: a task routing agent incorporates an attention-based similarity recognition module to determine source-target transfer pairs via attention scores; a knowledge control agent determines the proportion of elite solutions to transfer; and a group of strategy adaptation agents controls transfer strength by dynamically adjusting hyper-parameters in the underlying EMT framework. By pre-training all network modules end-to-end over an augmented multitask problem distribution, a generalizable meta-policy is obtained. Comprehensive validation experiments show state-of-the-art performance of our method against representative baselines. Further in-depth analysis not only reveals the rationale behind our proposal but also provides insightful interpretations of what the system has learned.
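The three decisions named in the abstract (where, what, how) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain dot-product attention, the fixed proportion and strength values, and the linear blend used as the transfer operator are all assumptions made for illustration. In MetaMTO these quantities are produced by learned policy networks, and transfer strength is realized through hyper-parameters of the underlying EMT framework.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_source(task_features, target):
    """WHERE: pick a source task for `target` from attention scores.

    Assumption: scaled dot-product similarity between hand-given task
    feature vectors stands in for the learned attention module.
    """
    q = task_features[target]
    scale = math.sqrt(len(q))
    candidates = [i for i in range(len(task_features)) if i != target]
    scores = [
        sum(a * b for a, b in zip(q, task_features[i])) / scale
        for i in candidates
    ]
    weights = softmax(scores)
    best = max(range(len(candidates)), key=lambda k: weights[k])
    return candidates[best], weights


def select_elites(population, fitness, proportion):
    """WHAT: keep the best `proportion` of the source population (minimization)."""
    k = max(1, round(proportion * len(population)))
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    return [population[i] for i in order[:k]]


def transfer(target_pop, elites, strength, rng):
    """HOW: move each target individual toward a randomly chosen elite.

    Assumption: a linear blend controlled by `strength` is a hypothetical
    stand-in for the hyper-parameters of a real EMT transfer operator.
    """
    blended = []
    for x in target_pop:
        e = rng.choice(elites)
        blended.append(
            [(1 - strength) * xi + strength * ei for xi, ei in zip(x, e)]
        )
    return blended
```

A call such as `route_source(features, target=0)` returns the index of the most similar other task together with the attention weights over all candidates; the selected task's elites are then passed to `transfer` along with the chosen strength.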

Paper Structure

This paper contains 35 sections, 15 equations, 8 figures, 6 tables, 1 algorithm.

Figures (8)

  • Figure 1: Conceptual overview of the MetaMTO framework.
  • Figure 2: A detailed illustration of the overall workflow within MetaMTO. Following the bi-level architecture in MetaBBO, a multi-role RL system is deployed at the meta-level to control the knowledge transfer scheme at the low-level evolution. A single evolution step is presented for simplicity.
  • Figure 3: Convergence curves on the five augmented test sets AWCCI-VS to AWCCI-VL.
  • Figure 4: Left: comparison when shifting from training on AWCCI-VS to testing on AWCCI-VL. Right: comparison when shifting from training on 10 sub-tasks to testing on 30 sub-tasks.
  • Figure 5: Left: average training time. Right: testing time per task.
  • ...and 3 more figures