Solving a Rubik's Cube Using its Local Graph Structure
Shunyu Yao, Mitchy Lee
TL;DR
This work treats the Rubik's Cube as a large state-space graph and introduces a heuristic driven by local graph structure, the weighted convolutional distance (WCD), to guide A* search. The method combines a pre-trained distance predictor $f_d$ (from DeepCubeA) with an action-probability policy $f_p$ to compute multi-hop neighborhood distances $d^{(k)}(s)$. The first hop is $d^{(1)}(s)=\mu f_d(s)+(1-\mu)\sum_{A} p_{s_A} f_d(s_A)$, where $s_A$ is the neighbor reached from $s$ by action $A$, $p_{s_A}$ is the weight the policy assigns to that neighbor, and $\mu$ balances a state's own estimate against its neighbors' estimates. Further hops follow $d^{(k+1)}(s)=\mu d^{(k)}(s)+(1-\mu) f_p(s)^T \mathbf{d}_{adj}^{(k)}(s)$, where $\mathbf{d}_{adj}^{(k)}(s)$ collects the $k$-hop distances of the neighbors of $s$. Compared with the DeepCubeA baseline, this locally grounded heuristic improves search direction, reduces the number of explored nodes and, in some cases, shortens the solution. The cost is higher computation time, because the convolution is not yet implemented in matrix form. The authors point to matrix-form implementations and GPU acceleration as the way forward, and suggest the approach may apply to other combinatorial puzzles with graph-like state spaces, such as Sokoban and Lights Out, and more broadly to resource-constrained planning in large discrete domains.
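The recursion above can be illustrated on a toy graph. The following is a minimal sketch, not the authors' implementation: the ring-shaped state space, the closed-form stand-in for $f_d$, the softmax stand-in for $f_p$, and the convention $d^{(0)}=f_d$ (which makes the first-hop formula a special case of the general one) are all assumptions made for illustration.

```python
import math

N = 12    # hypothetical toy state space: positions 0..N-1 on a ring, goal is 0
MU = 0.5  # mixing weight mu (assumed value)

def neighbors(s):
    """States reachable from s by the two toy actions (+1 and -1 on the ring)."""
    return [(s + 1) % N, (s - 1) % N]

def f_d(s):
    """Stand-in for the learned distance predictor: exact ring distance to 0."""
    return float(min(s, N - s))

def f_p(s):
    """Stand-in for the policy: softmax over negative neighbor distances."""
    scores = [math.exp(-f_d(n)) for n in neighbors(s)]
    total = sum(scores)
    return [x / total for x in scores]

def wcd(s, k, mu=MU):
    """k-hop weighted convolutional distance, with d^(0) taken to be f_d."""
    if k == 0:
        return f_d(s)
    weights = f_p(s)
    adj = [wcd(n, k - 1, mu) for n in neighbors(s)]  # d_adj^(k-1)(s)
    conv = sum(p * d for p, d in zip(weights, adj))  # f_p(s)^T d_adj
    return mu * wcd(s, k - 1, mu) + (1 - mu) * conv

if __name__ == "__main__":
    for k in range(3):
        print(k, wcd(3, k))
```

Written this way, each additional hop multiplies the number of recursive calls by the branching factor plus one, which is consistent with the higher computation time reported for the non-matrix form.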
Abstract
The Rubik's Cube is a three-dimensional single-player combination puzzle that has attracted attention in the reinforcement learning community. A Rubik's Cube has six faces and twelve possible actions, which gives a small, unconstrained action space and a very large state space with only one goal state. Modeling such a large state space and storing information about each state requires exceptional computational resources, which makes it challenging to find the shortest solution to a scrambled Rubik's Cube with limited resources. The Rubik's Cube can be represented as a graph in which the states of the cube are nodes and the actions are edges. Drawing on graph convolutional networks, we design a new heuristic, the weighted convolutional distance, for the A* search algorithm to find a solution to a scrambled Rubik's Cube. This heuristic uses the information of neighboring nodes and convolves it with attention-like weights, which lets the search look deeper for the shortest path to the solved state.
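To make the graph view concrete, here is a minimal sketch of A* search over a state graph with a pluggable heuristic. It is an illustration only: the four-element permutation puzzle, its three actions, and the function names are hypothetical stand-ins for the cube and its twelve face turns, and any heuristic (such as a weighted convolutional distance) could be passed in place of the simple ones used here.

```python
import heapq
from itertools import count

GOAL = (0, 1, 2, 3)  # hypothetical solved state of a toy permutation puzzle

# Hypothetical toy actions (edges of the state graph).
ACTIONS = {
    "rot_left": lambda s: s[1:] + s[:1],
    "rot_right": lambda s: s[-1:] + s[:-1],
    "swap": lambda s: (s[1], s[0]) + s[2:],
}

def a_star(start, heuristic):
    """Return (list of action names reaching GOAL, number of expanded nodes)."""
    tie = count()  # tie-breaker so the heap never compares states or paths
    frontier = [(heuristic(start), next(tie), 0, start, [])]
    best_cost = {start: 0}
    expanded = 0
    while frontier:
        _, _, cost, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path, expanded
        if cost > best_cost.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found
        expanded += 1
        for name, move in ACTIONS.items():
            nxt = move(state)
            new_cost = cost + 1  # every action costs one move
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(nxt)
                heapq.heappush(
                    frontier, (priority, next(tie), new_cost, nxt, path + [name])
                )
    return None, expanded

def apply_path(state, path):
    """Replay a list of action names from the given state."""
    for name in path:
        state = ACTIONS[name](state)
    return state

def misplaced(state):
    """Simple illustrative heuristic: number of elements out of place."""
    return sum(1 for a, b in zip(state, GOAL) if a != b)

if __name__ == "__main__":
    path, expanded = a_star((3, 2, 1, 0), misplaced)
    print(path, expanded)
```

With a zero heuristic the search reduces to uniform-cost search and returns a shortest path; a more informative heuristic changes the order in which nodes are expanded, which is the lever the weighted convolutional distance is meant to pull.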
