Adaptive operator selection utilising generalised experience
Mehmet Emin Aydin, Rafet Durgut, Abdur Rakib
TL;DR
The paper tackles the challenge of balancing exploration and exploitation in combinatorial optimisation by introducing a reinforcement learning–based adaptive operator selection framework within the Artificial Bee Colony (ABC) algorithm. It advances a generalisable approach that (i) orchestrates multiple neighbourhood operators, (ii) uses a feature-based state representation to enable size‑invariant learning, and (iii) partitions the search space and applies transfer learning to reuse experience across problem instances. Empirical results on the OneMax and Set Union Knapsack Problems show that feature-based states and sub-space splitting improve performance, while transfer learning yields mixed but often beneficial effects depending on problem type and configuration. Overall, the work demonstrates a scalable, transferable framework for adaptive operator selection, and identifies deeper RL models, multi-agent setups, and expanded operator pools as directions for future work.
Abstract
Optimisation problems, particularly combinatorial optimisation problems, are difficult to solve due to their complexity and hardness. Such problems have been successfully solved by evolutionary and swarm intelligence algorithms, especially in binary format. However, the quality of the approximation may suffer from a poor balance between exploration and exploitation activities (EvE), which remains the major challenge in this context. Although the complementary use of multiple operators, managed by adaptive operator selection schemes, is becoming more popular for handling EvE, designing a bespoke adaptive selection system is still an important research topic. Reinforcement Learning (RL) has recently been proposed as a way to customise and shape a highly effective adaptive selection system. However, scalability remains a challenge for such systems. This paper proposes and assesses a novel RL-based approach to help develop a generalised framework for gaining, processing, and utilising experience for both immediate and future use. The experimental results support the proposed approach with a certain level of success.
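To make the idea of RL-driven adaptive operator selection concrete, the following is a minimal sketch on OneMax, not the authors' implementation. It assumes a plain hill climber in place of the Artificial Bee Colony algorithm, a tabular Q-learning rule, three illustrative bit-level operators, and a single state feature (the discretised fraction of ones); all function names, parameters, and values here are hypothetical choices made for illustration.

```python
import random

def onemax(bits):
    """OneMax fitness: the number of ones in the bit string."""
    return sum(bits)

# Three illustrative neighbourhood operators (assumed, not the paper's pool).
def flip_one(bits, rng):
    child = bits[:]
    child[rng.randrange(len(child))] ^= 1
    return child

def flip_two(bits, rng):
    child = bits[:]
    for i in rng.sample(range(len(child)), 2):
        child[i] ^= 1
    return child

def swap_bits(bits, rng):
    child = bits[:]
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

OPERATORS = [flip_one, flip_two, swap_bits]

def state_of(bits, buckets=5):
    """Feature-based state: fraction of ones, discretised into buckets.
    It does not depend on the string length, so the table is size-invariant."""
    frac = sum(bits) / len(bits)
    return min(int(frac * buckets), buckets - 1)

def run(n_bits=40, steps=2000, epsilon=0.1, alpha=0.2, gamma=0.9, seed=0, q=None):
    """Hill climbing with epsilon-greedy operator selection over a Q-table.
    Passing an existing table `q` reuses experience from an earlier run."""
    rng = random.Random(seed)
    q = {} if q is None else q
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        s = state_of(current)
        values = q.setdefault(s, [0.0] * len(OPERATORS))
        if rng.random() < epsilon:   # explore: random operator
            a = rng.randrange(len(OPERATORS))
        else:                        # exploit: best-valued operator
            a = max(range(len(OPERATORS)), key=values.__getitem__)
        child = OPERATORS[a](current, rng)
        gain = onemax(child) - onemax(current)
        reward = 1.0 if gain > 0 else 0.0
        if gain >= 0:                # accept only non-worsening moves
            current = child
        next_values = q.setdefault(state_of(current), [0.0] * len(OPERATORS))
        values[a] += alpha * (reward + gamma * max(next_values) - values[a])
    return current, q

if __name__ == "__main__":
    best, table = run()
    print("fitness on 40 bits:", onemax(best))
    # Reuse the learned table on a larger instance (a crude form of transfer).
    best80, _ = run(n_bits=80, q=table)
    print("fitness on 80 bits:", onemax(best80))
```

Because the state is a ratio rather than a raw bit pattern, the same table can be carried from the 40-bit run to the 80-bit run, which is the intuition behind size-invariant learning and experience reuse; whether such reuse helps in practice depends on the problem and configuration.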
