Strongly Solving 2048 4x3
Tomoyuki Kaneko, Shuhei Yamashita
TL;DR
The paper strongly solves a stochastic single-player game: 2048 on a $4\times3$ board. It introduces an age-based partitioning strategy, where the age of a state is the sum of its tile values and is unchanged between a state and the afterstate that follows any move. This enables memory-bounded forward and backward dynamic programming that identifies an exact optimal policy and value function for all reachable states. Key contributions include a large-scale enumeration of about $1.15\times10^{12}$ states and $7.40\times10^{11}$ afterstates, a compact Elias-Fano based storage scheme that reduces disk usage to about $1.4$ TiB, and practical guidance for optimal play, which can be further compressed to a few hundred GiB. The work provides a valuable dataset and a methodological blueprint for solving other large stochastic single-player games, with implications for verification, benchmarking, and insights into how game difficulty varies across stages.
Abstract
2048 is a stochastic single-player game played with 16 cells on a 4 by 4 grid, where the player chooses a direction among up, down, left, and right and scores by merging two tiles with the same number located in neighboring cells along the chosen direction. This paper reports that a variant, 2048-4x3, played with 12 cells on a 4 by 3 board (one row smaller than the original), has been strongly solved. In this variant, the expected score achieved by an optimal strategy is about $50724.26$ for the most common initial states, namely those with two tiles of number 2. The numbers of reachable states and afterstates are identified to be $1,152,817,492,752$ and $739,648,886,170$, respectively. The key technique is to partition the state space by the sum of the tile numbers on a board, which we call the age of a state. The age is invariant between a state and its successive afterstate after any valid action, and it increases by two or four through the stochastic response from the environment. Therefore, we can partition the state space by age and enumerate all (after)states of a given age using only states of the most recent preceding ages. Similarly, we can identify (after)state values by processing ages in decreasing order.
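
The age property described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the board encoding (3 rows of 4 cells, with 0 for an empty cell), the helper names `slide_row_left`, `move_left`, `age`, and `spawn_outcomes`, and the spawn probabilities (0.9 for a 2-tile, 0.1 for a 4-tile, as in the common 2048 rules) are all assumptions made for illustration. Only the left move is shown; the other directions would follow by symmetry.

```python
from typing import List, Tuple

Board = Tuple[Tuple[int, ...], ...]  # hypothetical encoding: rows of tile numbers, 0 = empty


def slide_row_left(row: Tuple[int, ...]) -> Tuple[Tuple[int, ...], int]:
    """Slide one row to the left, merging each equal pair once.

    Returns the new row and the score gained from merges in this row.
    """
    tiles = [t for t in row if t != 0]
    merged: List[int] = []
    reward = 0
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)   # two equal tiles become one doubled tile
            reward += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    merged += [0] * (len(row) - len(merged))
    return tuple(merged), reward


def move_left(board: Board) -> Tuple[Board, int]:
    """Apply the 'left' action to a state, giving the afterstate and the reward."""
    rows, total = [], 0
    for row in board:
        new_row, reward = slide_row_left(row)
        rows.append(new_row)
        total += reward
    return tuple(rows), total


def age(board: Board) -> int:
    """Age = sum of all tile numbers on the board."""
    return sum(sum(row) for row in board)


def spawn_outcomes(afterstate: Board) -> List[Tuple[float, Board]]:
    """Enumerate (probability, next_state) pairs for the environment's response.

    Assumption: a 2 appears with probability 0.9 and a 4 with probability 0.1,
    uniformly over empty cells.
    """
    empties = [(r, c) for r, row in enumerate(afterstate)
               for c, t in enumerate(row) if t == 0]
    outcomes = []
    for r, c in empties:
        for tile, p in ((2, 0.9), (4, 0.1)):
            grid = [list(row) for row in afterstate]
            grid[r][c] = tile
            outcomes.append((p / len(empties), tuple(tuple(row) for row in grid)))
    return outcomes


state: Board = ((2, 2, 4, 0), (0, 4, 4, 8), (2, 0, 0, 2))
afterstate, reward = move_left(state)
next_ages = {age(s) for _, s in spawn_outcomes(afterstate)}
print(age(state), age(afterstate), reward, sorted(next_ages))
```

Because merging replaces two equal tiles with one tile of double the number, the sum of tile numbers is the same before and after the move, while each spawned tile raises it by exactly 2 or 4. Under these assumptions, the states of age $a$ can be generated from the afterstates of ages $a-2$ and $a-4$, and the values of age $a$ depend only on ages $a+2$ and $a+4$, which is what allows the state space to be processed one age at a time.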
