Evolutionary Reinforcement Learning for Interpretable Decision-Making in Supply Chain Management

Stefano Genetti, Alberto Longobardi, Giovanni Iacca

TL;DR

This paper tackles the need for transparent decision-making in SCM by introducing Evolutionary Learning Decision Trees (ELDT), which fuse Grammatical Evolution with Q-learning to produce interpretable policies. ELDT is embedded in a versatile simulation-based optimization framework that links AnyLogic simulations with Python optimizers via ALPypeOpt, enabling evaluation across two SCM problems: a make-or-buy outsourcing task and a real-world hybrid flow shop scheduling problem for laser-cutting machines. Across synthetic and real datasets, ELDT often matches or exceeds the performance of black-box baselines while delivering simple, interpretable decision trees that reveal actionable rules. The work demonstrates the practical viability of interpretable AI in Industry 4.0 SCM, showing that interpretability need not sacrifice optimization efficiency and highlighting avenues for industrial deployment and further research.

Abstract

In the context of Industry 4.0, Supply Chain Management (SCM) faces challenges in adopting advanced optimization techniques due to the "black-box" nature of most AI-based solutions, which causes reluctance among company stakeholders. To overcome this issue, in this work, we employ an Interpretable Artificial Intelligence (IAI) approach that combines evolutionary computation with Reinforcement Learning (RL) to generate interpretable decision-making policies in the form of decision trees. This IAI solution is embedded within a simulation-based optimization framework specifically designed to handle the inherent uncertainties and stochastic behaviors of modern supply chains. To our knowledge, this marks the first attempt to combine IAI with simulation-based optimization for decision-making in SCM. The methodology is tested on two supply chain optimization problems, one fictional and one from the real world, and its performance is compared against widely used optimization and RL algorithms. The results reveal that the interpretable approach delivers competitive, and sometimes better, performance, challenging the prevailing notion that there must be a trade-off between interpretability and optimization efficiency. Additionally, the developed framework demonstrates strong potential for industrial applications, offering seamless integration with various Python-based algorithms.
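To make the idea of a decision-tree policy trained by RL more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tree whose internal nodes test a state feature against a threshold and whose leaves hold one Q-value per action, updated with standard tabular Q-learning. The evolutionary part is left out: here the tree structure is fixed by hand, whereas an evolutionary outer loop would search over such structures and could score each candidate, for instance, by the return it collects. All names (`Leaf`, `Split`, `run_episode`, `make_toy_env`) and the toy environment are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """Leaf holding one Q-value per action (learned by Q-learning)."""
    n_actions: int
    q: List[float] = field(default_factory=list)

    def __post_init__(self):
        if not self.q:
            self.q = [0.0] * self.n_actions

    def best_action(self) -> int:
        # Greedy action; ties resolve to the lowest action index.
        return max(range(self.n_actions), key=lambda a: self.q[a])


@dataclass
class Split:
    """Internal node testing `state[feature] < threshold`."""
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def find_leaf(node, state):
    """Walk from the root to the leaf that covers `state`."""
    while isinstance(node, Split):
        node = node.left if state[node.feature] < node.threshold else node.right
    return node


def q_update(leaf, action, reward, next_leaf, alpha=0.1, gamma=0.9):
    """Standard Q-learning update applied to the visited leaf."""
    target = reward + gamma * max(next_leaf.q)
    leaf.q[action] += alpha * (target - leaf.q[action])


def run_episode(tree, env_step, state, n_steps, epsilon=0.2, seed=0):
    """Epsilon-greedy rollout; returns the total reward collected."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_steps):
        leaf = find_leaf(tree, state)
        if rng.random() < epsilon:
            action = rng.randrange(leaf.n_actions)
        else:
            action = leaf.best_action()
        next_state, reward = env_step(state, action)
        q_update(leaf, action, reward, find_leaf(tree, next_state))
        state, total = next_state, total + reward
    return total


def make_toy_env(seed=1):
    """Hypothetical toy task: action 0 is right when state[0] < 0.5, else action 1."""
    rng = random.Random(seed)

    def step(state, action):
        correct = 0 if state[0] < 0.5 else 1
        reward = 1.0 if action == correct else 0.0
        return [rng.random()], reward

    return step


# Hand-written tree structure; an evolutionary search would propose these.
tree = Split(feature=0, threshold=0.5, left=Leaf(2), right=Leaf(2))
total = run_episode(tree, make_toy_env(), state=[0.2], n_steps=5000)
print("left leaf action:", tree.left.best_action())
print("right leaf action:", tree.right.best_action())
```

After training, the learned policy can be read directly off the tree: each root-to-leaf path is an if-then rule, and the greedy action at the leaf is the decision. This readability is what the interpretable approach relies on.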

Paper Structure

This paper contains 8 sections, 3 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A schematic of the proposed algorithm's internal workings. Blue blocks indicate components related to the evolutionary loop, while red blocks represent elements of the Q-learning loop. Adapted from [custode2023evolutionary].
  • Figure 2: Schematic representation of our simulation-optimization framework, utilizing the ALPypeOpt open-source library to establish a bidirectional connection between an AnyLogic simulation and a Python optimization/RL algorithm.
  • Figure 3: Results for the make-or-buy decision problem on list1.
  • Figure 4: Results for the HFS scheduling problem (datasets d1, d4, and d5). The y-axis displays the best makespan (average across 10 runs) found by each algorithm across evaluations. The result of the greedy policy is reported for reference.
  • Figure 5: Results for the HFS scheduling problem. Best DTs found by ELDT on datasets d1, d4, and d5 across 10 runs, with leaf nodes colored on a gradient from light red (low priority for the input order) to dark red (high priority for the input order), indicating how the simulation model prioritizes scheduling.