Modeling Local Search Metaheuristics Using Markov Decision Processes

Rubén Ruiz-Torrubiano

TL;DR

This work addresses the challenge of understanding and selecting among local search metaheuristics by introducing an MDP-based framework that models local search metaheuristics as policies on a discrete-time, infinite-horizon MDP with explicit rewards. It defines convergence and exploration-exploitation measures, notably the convergence coefficient $\gamma_i^A(t)$ and the exploration-exploitation coefficient $\delta_i^A$, and proves a local search exploration-exploitation theorem. Applying the framework to hill climbing and simulated annealing (SA) shows that hill climbing is exploitation-oriented while SA is balanced, in line with established intuition. The approach provides a theory-grounded basis for choosing metaheuristics for a given problem and can be extended to other local search and population-based methods.

Abstract

Local search metaheuristics such as tabu search and simulated annealing are popular heuristic optimization algorithms for finding near-optimal solutions to combinatorial optimization problems. However, it remains challenging for researchers and practitioners to analyze their behaviour and to choose systematically among the vast set of available metaheuristics for the particular problem at hand. In this paper, we introduce a theoretical framework based on Markov Decision Processes (MDPs) for analyzing local search metaheuristics. This framework not only helps in providing convergence results for individual algorithms, but also provides an explicit characterization of the exploration-exploitation tradeoff and theory-grounded guidance for practitioners choosing an appropriate metaheuristic. We present this framework in detail and show how to apply it to hill climbing and simulated annealing.
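
To make the idea of "a metaheuristic as a policy" concrete, the following is a minimal sketch and not the paper's formal construction. It assumes a toy one-dimensional cost landscape (`COST`), a hypothetical neighborhood of adjacent indices, and a geometric cooling schedule (`T0`, `ALPHA`); none of these names or values come from the paper. Each algorithm is written as a function that maps the current state to the next state, which is the role a policy plays in an MDP.

```python
import math
import random
from typing import Callable, List

# Hypothetical toy landscape (not from the paper): states are indices, lower cost is better.
COST = [5, 3, 4, 6, 2, 1, 7]
T0, ALPHA = 2.0, 0.95  # assumed geometric cooling schedule: T_k = T0 * ALPHA**k


def neighbors(s: int, n: int) -> List[int]:
    """Actions available in state s: move to an adjacent index."""
    return [t for t in (s - 1, s + 1) if 0 <= t < n]


def hill_climbing_policy(s: int, cost: List[int]) -> int:
    """Deterministic greedy policy: move to the best neighbor only if it strictly improves."""
    best = min(neighbors(s, len(cost)), key=lambda t: cost[t])
    return best if cost[best] < cost[s] else s


def simulated_annealing_policy(s: int, cost: List[int], temperature: float,
                               rng: random.Random) -> int:
    """Stochastic policy: propose a random neighbor, accept it by the Metropolis rule."""
    t = rng.choice(neighbors(s, len(cost)))
    delta = cost[t] - cost[s]
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return t
    return s


def run(policy: Callable[[int, int], int], s0: int, steps: int) -> List[int]:
    """Roll out a policy from state s0 and return the visited states."""
    trajectory = [s0]
    for k in range(steps):
        trajectory.append(policy(trajectory[-1], k))
    return trajectory


rng = random.Random(0)
hc_trajectory = run(lambda s, k: hill_climbing_policy(s, COST), 0, 5)
sa_trajectory = run(
    lambda s, k: simulated_annealing_policy(s, COST, T0 * ALPHA**k, rng), 0, 200
)

print("hill climbing:", hc_trajectory)  # gets stuck in the local minimum at index 1
print("simulated annealing visited states:", sorted(set(sa_trajectory)))
```

On this landscape the greedy policy stops at the first local minimum, while the annealing policy can still accept worsening moves at high temperature. This is only an informal illustration of the exploitation-oriented versus balanced distinction; the paper's coefficients $\gamma_i^A(t)$ and $\delta_i^A$ are not computed here.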

Paper Structure (8 sections, 1 theorem, 17 equations)

Introduction
Previous Work
Markov Decision Processes for Local Search Metaheuristics
Definitions
Optimal policies
Local search metaheuristics as policies
Simulated Annealing
Conclusions

Key Result

Theorem 1

Let $M$ be a local search MDP. For any local search metaheuristic $A$, there exists a policy $R_A$ such that

Theorems & Definitions (12)

Definition 1: Markov Decision Process
Definition 2
Definition 3
Definition 4
Definition 5: Local search MDP
Example 1: Hill climbing
Definition 6: Exploration and exploitation
Definition 7: Exploration-exploitation function
Theorem 1: Local search exploration-exploitation theorem
proof
...and 2 more
