Letters, Colors, and Words: Constructing the Ideal Building Blocks Set
Ricardo Salazar, Shahrzad Jamshidi
TL;DR
The study defines a six-cube, six-color building-block puzzle intended to maximize the number of mono and rainbow words that can be spelled from a dataset of short English words. It formulates a high-dimensional combinatorial optimization problem and evaluates multiple heuristics under strict alphabet-coverage constraints: random search, simulated annealing, three tree-search strategies (constrained greedy, best-first, and greedy), reinforcement learning, and a genetic algorithm. The results show that, within the constrained setup, a greedy tree search with the base permutation achieves the best single result (~2386 words), while a stochastic genetic algorithm can reach higher counts (up to 2846 words) when constraints are relaxed; reinforcement learning fails to converge effectively. These findings illustrate the feasibility of heuristic search for combinatorial language-building puzzles and offer insights for designing educational puzzle toys and scalable versions with more cubes or colors.
Abstract
Define a building blocks set to be a collection of n cubes (each with six sides) where each side is assigned one letter and one color from a palette of m colors. We propose a novel problem: assign letters and colors to each face so as to maximize the number of words from a chosen dataset that can be spelled as either mono words (all letters have the same color) or rainbow words (all letters have distinct colors). We study this problem for a chosen set of English words, up to six letters long, drawn from the typical vocabulary of a 14-year-old in the United States, with n=6 and m=6 and the added restriction that each color appears exactly once on each cube. The problem is intractable, as the size of the solution space makes a brute-force approach computationally infeasible. To address this, we explore a range of optimization techniques: random search, simulated annealing, two distinct tree search methods (greedy and best-first), and a genetic algorithm. Additionally, we attempted to implement a reinforcement learning approach; however, the model failed to converge to viable solutions within the problem's constraints. Among these methods, the genetic algorithm delivered the best performance, achieving a total of 2846 mono and rainbow words.
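To make the objective concrete, the following is a minimal Python sketch of how a candidate block set might be scored. It is an illustration under stated assumptions, not the authors' implementation: we assume each cube can contribute at most one face to a word, that a word satisfying both definitions is counted once, and the names `can_spell`, `count_mono_or_rainbow`, and the three-cube toy set are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Face = Tuple[str, int]   # (letter, color index)
Cube = List[Face]        # six faces per cube


def can_spell(word: str, cubes: List[Cube], mode: str) -> bool:
    """Return True if `word` can be spelled as a 'mono' or 'rainbow' word.

    Assumption: each cube shows one face, so a cube is used at most once.
    """
    if mode not in ("mono", "rainbow"):
        raise ValueError("mode must be 'mono' or 'rainbow'")

    def backtrack(pos: int, used_cubes: FrozenSet[int],
                  used_colors: FrozenSet[int]) -> bool:
        if pos == len(word):
            return True
        for i, cube in enumerate(cubes):
            if i in used_cubes:
                continue
            for letter, color in cube:
                if letter != word[pos]:
                    continue
                # Mono: every face must share the color of the first face.
                if mode == "mono" and used_colors and color not in used_colors:
                    continue
                # Rainbow: no color may repeat.
                if mode == "rainbow" and color in used_colors:
                    continue
                if backtrack(pos + 1, used_cubes | {i}, used_colors | {color}):
                    return True
        return False

    return backtrack(0, frozenset(), frozenset())


def count_mono_or_rainbow(words: List[str], cubes: List[Cube]) -> int:
    """Count words spellable as mono or rainbow (each word counted once)."""
    return sum(
        1 for w in words
        if can_spell(w, cubes, "mono") or can_spell(w, cubes, "rainbow")
    )


# Toy three-cube set; every cube uses each of the six colors exactly once.
toy_cubes: List[Cube] = [
    [("c", 0), ("a", 1), ("t", 2), ("x", 3), ("y", 4), ("z", 5)],
    [("a", 0), ("t", 1), ("c", 2), ("q", 3), ("w", 4), ("e", 5)],
    [("t", 0), ("c", 1), ("a", 2), ("r", 3), ("s", 4), ("u", 5)],
]

if __name__ == "__main__":
    print(count_mono_or_rainbow(["cat", "zq", "xy", "ze"], toy_cubes))
```

In this toy set, "cat" can be spelled in a single color, "zq" only with two different colors, and "xy" not at all because both letters sit on the same cube. Any of the search heuristics mentioned above would call a scoring function of this kind many times, so a practical version would likely precompute letter-to-face lookups rather than scan every cube.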
