Evolutionary Reinforcement Learning for Interpretable Decision-Making in Supply Chain Management
Stefano Genetti, Alberto Longobardi, Giovanni Iacca
TL;DR
This paper tackles the need for transparent decision-making in supply chain management (SCM) by introducing Evolutionary Learning Decision Trees (ELDT), which fuse Grammatical Evolution with Q-learning to produce interpretable policies. ELDT is embedded in a simulation-based optimization framework that links AnyLogic simulations with Python optimizers via ALPypeOpt, enabling evaluation on two SCM problems: a make-or-buy outsourcing task and a real-world hybrid flow shop scheduling problem for laser-cutting machines. Across synthetic and real datasets, ELDT often matches or exceeds the performance of black-box baselines while delivering simple, interpretable decision trees that reveal actionable rules. The work demonstrates the practical viability of interpretable AI in Industry 4.0 SCM, showing that interpretability need not sacrifice optimization efficiency and highlighting avenues for industrial deployment and further research.
Abstract
In the context of Industry 4.0, Supply Chain Management (SCM) faces challenges in adopting advanced optimization techniques due to the "black-box" nature of most AI-based solutions, which causes reluctance among company stakeholders. To overcome this issue, we employ an Interpretable Artificial Intelligence (IAI) approach that combines evolutionary computation with Reinforcement Learning (RL) to generate interpretable decision-making policies in the form of decision trees. This IAI solution is embedded within a simulation-based optimization framework specifically designed to handle the inherent uncertainties and stochastic behaviors of modern supply chains. To our knowledge, this is the first attempt to combine IAI with simulation-based optimization for decision-making in SCM. The methodology is tested on two supply chain optimization problems, one fictional and one real-world, and its performance is compared against widely used optimization and RL algorithms. The results show that the interpretable approach delivers competitive, and sometimes better, performance, challenging the prevailing notion that there must be a trade-off between interpretability and optimization efficiency. Additionally, the developed framework shows strong potential for industrial applications, since it integrates with a range of Python-based algorithms.
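To make the idea of a decision-tree policy trained with RL more concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common way of combining the two ingredients: the tree structure (which feature is tested against which threshold) is fixed from outside, as an evolutionary search would supply it, while each leaf stores one Q-value per action and is updated with the standard Q-learning rule. All names (`Leaf`, `Node`, `find_leaf`, `q_update`), the single "inventory" feature, and the two actions are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Union


@dataclass
class Leaf:
    """Leaf holding one Q-value per action (hypothetical structure)."""
    n_actions: int
    q: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with all Q-values at zero unless values were supplied.
        if not self.q:
            self.q = [0.0] * self.n_actions

    def greedy_action(self) -> int:
        # Index of the action with the highest Q-value.
        return max(range(self.n_actions), key=lambda a: self.q[a])


@dataclass
class Node:
    """Internal node testing `state[feature] < threshold`."""
    feature: int
    threshold: float
    left: Union["Node", Leaf]   # followed when the test is true
    right: Union["Node", Leaf]  # followed when the test is false


def find_leaf(tree: Union[Node, Leaf], state: Sequence[float]) -> Leaf:
    """Walk from the root to the leaf selected by `state`."""
    while isinstance(tree, Node):
        tree = tree.left if state[tree.feature] < tree.threshold else tree.right
    return tree


def q_update(leaf: Leaf, action: int, reward: float, next_leaf: Leaf,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Standard Q-learning update applied to the visited leaf."""
    target = reward + gamma * max(next_leaf.q)
    leaf.q[action] += alpha * (target - leaf.q[action])


# Toy usage: one split on an assumed "inventory" feature (index 0),
# two assumed actions (0 = make, 1 = buy).
tree = Node(feature=0, threshold=5.0, left=Leaf(2), right=Leaf(2))
leaf = find_leaf(tree, [3.0])          # inventory below 5 -> left leaf
next_leaf = find_leaf(tree, [7.0])     # next state lands in the right leaf
q_update(leaf, action=1, reward=1.0, next_leaf=next_leaf)
print(leaf.q, leaf.greedy_action())
```

A policy of this form can be read directly as a rule ("if inventory is below 5, take the action with the highest learned value in that leaf"), which is the sense in which such trees are interpretable. In this sketch the outer evolutionary loop that proposes and selects tree structures is deliberately omitted.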
