Automatic Policy Search using Population-Based Hyper-heuristics for the Integrated Procurement and Perishable Inventory Problem
Leonardo Kanashiro Felizardo, Edoardo Fadda, Mariá Cristina Vasconcelos Nascimento
TL;DR
The paper tackles an integrated multi-item, multi-supplier perishable inventory problem under uncertainty, proposing a discrete-event simulation complemented by a formal mathematical model. It compares traditional baseline policies (COP, BSP, BSP-EW) with population-based hyper-heuristics (GA, EGA, PSO) that build item-level hybrid policies offline. Empirical results across twelve scenarios show that the hyper-heuristics consistently outperform the baselines, with GA often finding the best policy and EGA delivering the most robust average performance. The approach, however, incurs significant offline training time. The findings indicate practical value in using advanced search to jointly optimize supplier selection, heuristic choice, and policy parameters, especially in complex, high-uncertainty environments, and point to parallelization, integration with RL methods, and extension to multi-echelon networks as avenues for future work.
Abstract
This paper addresses the problem of managing perishable inventory under multiple sources of uncertainty, including stochastic demand, unreliable supplier fulfillment, and probabilistic product shelf life. We develop a discrete-event simulation environment to compare two optimization strategies for this multi-item, multi-supplier problem. The first strategy applies a single classic policy (e.g., Constant Order or Base Stock) uniformly to all items, tuning its parameters globally and using a direct search to select the best-fitting suppliers for the integrated problem. The second strategy is a hyper-heuristic framework driven by metaheuristics such as a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). This framework constructs a composite policy by automating the selection of the heuristic type, its parameters, and the sourcing suppliers on an item-by-item basis. Computational results on twelve distinct instances demonstrate that the hyper-heuristic framework consistently identifies superior policies, with GA and EGA exhibiting the best overall performance. Our primary contribution is verifying that this item-level policy construction yields significant performance gains over simpler global policies, thereby justifying the associated computational cost.
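To make the item-level construction concrete, the following is a minimal Python sketch of how a composite policy might be encoded and searched. It is an illustration under stated assumptions, not the authors' implementation. The names `ItemGene`, `SUPPLIERS`, `simulate_cost`, and `genetic_search` are hypothetical, as are all cost coefficients. The toy cost function assumes zero lead time, a fixed two-period shelf life, and all-or-nothing deliveries, and only stands in for the paper's discrete-event simulation.

```python
import random
from dataclasses import dataclass

# The two classic policies named in the abstract.
HEURISTICS = ("constant_order", "base_stock")
# Hypothetical supplier data: (unit price, probability an order is delivered).
SUPPLIERS = ((1.0, 0.70), (1.5, 0.95))


@dataclass(frozen=True)
class ItemGene:
    heuristic: str  # ordering rule used for this item
    param: int      # order quantity, or order-up-to level for base stock
    supplier: int   # index into SUPPLIERS


def order_quantity(gene, on_hand):
    if gene.heuristic == "constant_order":
        return gene.param
    return max(0, gene.param - on_hand)  # base stock: order up to the level


def simulate_cost(policy, periods=60, seed=0):
    """Toy stand-in for the simulation (assumed costs, 2-period shelf life)."""
    rng = random.Random(seed)  # fixed seed: every policy sees the same randomness
    total = 0.0
    for gene in policy:
        price, reliability = SUPPLIERS[gene.supplier]
        old = 0  # units left over from the previous period
        for _ in range(periods):
            ordered = order_quantity(gene, old)
            fresh = ordered if rng.random() < reliability else 0
            demand = rng.randint(0, 10)
            from_old = min(old, demand)  # serve the oldest units first
            from_fresh = min(fresh, demand - from_old)
            lost = demand - from_old - from_fresh
            expired = old - from_old     # unsold old units perish
            old = fresh - from_fresh     # unsold fresh units age by one period
            total += price * fresh + 4.0 * lost + 1.0 * expired + 0.1 * old
    return total


def random_gene(rng):
    return ItemGene(rng.choice(HEURISTICS), rng.randint(0, 15),
                    rng.randrange(len(SUPPLIERS)))


def genetic_search(n_items, generations=30, pop_size=20, seed=1):
    """Plain elitist GA over per-item (heuristic, parameter, supplier) triples."""
    rng = random.Random(seed)
    pop = [tuple(random_gene(rng) for _ in range(n_items))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=simulate_cost)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each item inherits its gene from one parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: resample the gene of one randomly chosen item.
            child[rng.randrange(n_items)] = random_gene(rng)
            children.append(tuple(child))
        pop = elite + children
    return min(pop, key=simulate_cost)


best = genetic_search(n_items=3)
for item, gene in enumerate(best):
    print(item, gene.heuristic, gene.param, gene.supplier)
```

In this sketch, the uniform-policy strategy corresponds to forcing every item to share one gene, whereas the hyper-heuristic lets each item carry its own. The richer search space is what allows better policies and also what drives the offline training cost noted above.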
