Unlearning Works Better Than You Think: Local Reinforcement-Based Selection of Auxiliary Objectives
Abderrahim Bendahi, Adrien Fradin, Matthieu Lerasle
TL;DR
This work tackles non-monotonic single-objective optimization by augmenting an evolutionary algorithm with reinforcement learning to select auxiliary objectives, while introducing an unlearning mechanism that discards no-longer-beneficial helpers. The Local Reinforcement-Based Selection of Auxiliary Objectives (LRSAO) uses a local reward structure and a plateau-aware penalty to adaptively switch or discard objectives like LeftBridge and RightBridge as the Jump$_{\ell}$ landscape changes, without requiring restarts. The authors prove a tight runtime bound of $\mathbb{E}(T)=\Theta\left(\frac{n^2}{\ell^2} + n\log n\right)$, improving over the previous $O\left(\frac{n^2\log n}{\ell}\right)$, and corroborate the analysis with experiments up to $n=10^4$. The results demonstrate that locality-driven learning and unlearning can significantly enhance the efficiency of RL-guided objective selection in complex optimization landscapes, with practical implications for adaptive heuristic design in evolutionary computation.
Abstract
We introduce Local Reinforcement-Based Selection of Auxiliary Objectives (LRSAO), a novel approach that uses reinforcement learning (RL) to select auxiliary objectives supporting the optimization process of an evolutionary algorithm (EA), as in the EA+RL framework, and that additionally incorporates the ability to unlearn previously used objectives. By modifying the reward mechanism to penalize moves that do not increase the fitness value, and by relying on local auxiliary objectives, LRSAO dynamically adapts its selection strategy to the landscape and unlearns previous objectives when necessary. We analyze and evaluate LRSAO on the black-box complexity version of the non-monotonic Jump function, with gap parameter $\ell$, where each auxiliary objective is beneficial at specific stages of optimization. The Jump function is hard for evolutionary algorithms to optimize, and the best-known complexity for reinforcement-based selection on Jump was $O(n^2 \log(n) / \ell)$. Our approach improves this result to $\Theta(n^2 / \ell^2 + n \log(n))$. This improvement demonstrates the efficiency and adaptability of LRSAO and highlights its potential to outperform traditional methods in complex optimization scenarios.
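To give a rough sense of the gap between the two bounds quoted above, the following minimal sketch evaluates their growth terms for a few values of $\ell$ at $n = 10^4$. It is only illustrative: the hidden constants of the asymptotic notation are dropped, the natural logarithm is assumed, and the function names `previous_bound`, `lrsao_bound`, and `ratio` are hypothetical helpers introduced here, not part of the paper.

```python
import math


def previous_bound(n: int, ell: int) -> float:
    """Growth term n^2 * log(n) / ell of the earlier O(.) bound (constants dropped)."""
    return n * n * math.log(n) / ell


def lrsao_bound(n: int, ell: int) -> float:
    """Growth term n^2 / ell^2 + n * log(n) of the Theta(.) bound (constants dropped)."""
    return n * n / (ell * ell) + n * math.log(n)


def ratio(n: int, ell: int) -> float:
    """How many times larger the earlier growth term is than the new one."""
    return previous_bound(n, ell) / lrsao_bound(n, ell)


if __name__ == "__main__":
    n = 10_000  # illustrative problem size (assumption of this sketch)
    for ell in (2, 10, 100, 1000):
        print(
            f"ell={ell:5d}  previous={previous_bound(n, ell):.3e}  "
            f"lrsao={lrsao_bound(n, ell):.3e}  ratio={ratio(n, ell):.1f}"
        )
```

Ignoring constants, the ratio behaves like $\ell \log n$ while the $n^2/\ell^2$ term dominates the new bound, and like $n/\ell$ once the $n \log n$ term takes over. The printed numbers should therefore be read as indicative orders of magnitude, not as predicted runtimes.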
