Completeness of Unbounded Best-First Minimax and Descent Minimax

Quentin Cohen-Solal

Abstract

In this article, we focus on search algorithms for two-player perfect-information games, whose objective is to determine the best possible strategy, and ideally a winning strategy. Unfortunately, some search algorithms for games in the literature cannot always determine a winning strategy, even with infinite search time. This is the case, for example, for Unbounded Best-First Minimax and Descent Minimax, which are core algorithms in state-of-the-art knowledge-free reinforcement learning. They were later improved with the so-called completion technique. However, whether this technique improves these algorithms enough for them to always determine a winning strategy has remained an open question until now. To answer this question, we generalize the two algorithms (in their versions using the completion technique) and show that any algorithm of this class computes the best strategy. Finally, we show experimentally that the completion technique improves winning performance.

Paper Structure

This paper contains 18 sections, 12 theorems, 1 equation, 1 table, 4 algorithms.

Key Result

Lemma 4

Let $\hat{s}$ be a state of a perfect two-player game $G$. If $r\left(\hat{s}\right)=1$ before an iteration of an Unbounded Minimax-based algorithm on a certain state $\hat{s}'$ of $G$, then after the iteration $r\left(\hat{s}\right)$ and $c\left(\hat{s}\right)$ are unchanged.
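
The invariant stated in Lemma 4 can be turned into a runtime check. The sketch below is only an illustration under stated assumptions: the `Node` class, the helper names, and the toy iteration are hypothetical and do not come from the paper, the search structure is assumed to be a finite tree, and the fields `r` and `c` simply mirror the lemma's $r$ and $c$. It records $(r, c)$ for every node with $r = 1$, runs one iteration supplied by the caller, and reports whether those pairs are unchanged.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical search-tree node; r and c mirror the lemma's notation."""
    r: int = 0  # resolution flag: 1 once the state is resolved
    c: int = 0  # completion value
    children: List["Node"] = field(default_factory=list)


def all_nodes(root: Node) -> List[Node]:
    """Collect every node of the tree rooted at root (depth-first walk)."""
    stack, seen = [root], []
    while stack:
        node = stack.pop()
        seen.append(node)
        stack.extend(node.children)
    return seen


def holds_lemma_4(root: Node, iteration: Callable[[Node], None]) -> bool:
    """Run one iteration; return True if every resolved node kept (r, c)."""
    before = [(n, n.r, n.c) for n in all_nodes(root) if n.r == 1]
    iteration(root)
    return all((n.r, n.c) == (r, c) for n, r, c in before)


def toy_iteration(root: Node) -> None:
    """Stand-in for a search iteration: it only touches unresolved nodes."""
    for n in all_nodes(root):
        if n.r == 0:
            n.r, n.c = 1, -1


tree = Node(children=[Node(r=1, c=1), Node()])
print(holds_lemma_4(tree, toy_iteration))
```

A check of this kind could be wrapped around each iteration of an actual implementation as a debugging assertion; it verifies the property on one run and does not replace the proof.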

Theorems & Definitions (21)

  • Definition 1
  • Definition 2
  • Definition 3
  • Lemma 4
  • Lemma 5
  • Proposition 6
  • Lemma 7
  • Proposition 8
  • Theorem 9
  • Lemma 10
  • ...and 11 more