Geometric Reasoning in the Embedding Space
Jan Hůla, David Mojžíšek, Jiří Janeček, David Herel, Mikoláš Janota
TL;DR
This work studies how neural models can reason about geometric constraints by constructing a synthetic constraint satisfaction problem on a discrete 2D grid. It compares a Graph Neural Network (GNN) and an autoregressive Transformer on predicting the positions of unknown points specified by constraints of types M, R, S, and T together with fixed points P. The GNN significantly outperforms the Transformer and scales to grid sizes of up to 80 × 80. Embedding visualizations show that a 2D grid structure emerges in the static embeddings and that the solution is refined iteratively as the embeddings evolve to reflect the underlying geometry. The findings offer insight into embedding-space mechanisms for geometric reasoning and highlight the scalability advantage of GNNs over Transformers in this setting; limitations and avenues for future work are also outlined.
Abstract
In this contribution, we demonstrate that Graph Neural Networks and Transformers can learn to reason about geometric constraints. We train them to predict the spatial positions of points in a discrete 2D grid from a set of constraints that uniquely describe hidden figures containing these points. Both models are able to predict the positions of the points and, interestingly, they form the hidden figures described by the input constraints in the embedding space during the reasoning process. Our analysis shows that both models recover the grid structure during training: the embeddings corresponding to the points within the grid organize themselves in a 2D subspace and reflect the neighborhood structure of the grid. We also show that the Graph Neural Network we design for the task performs significantly better than the Transformer and is also easier to scale.
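As an illustration of the two properties mentioned above (embeddings lying in a 2D subspace and reflecting the grid's neighborhood structure), the following is a minimal sketch of how one might test for them. It is not the authors' analysis code. The embeddings here are a hypothetical toy stand-in generated by `make_toy_grid_embeddings` (a grid linearly embedded in a higher-dimensional space plus noise), and all function names and parameters are assumptions made for the example. With a trained model one would substitute its learned grid-point embeddings.

```python
import numpy as np


def make_toy_grid_embeddings(n, dim, noise, seed=0):
    """Hypothetical stand-in for learned embeddings (an assumption, not real data):
    an n x n grid placed on a random 2D plane inside a dim-dimensional space."""
    rng = np.random.default_rng(seed)
    xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)  # (n*n, 2)
    # Orthonormal basis of a random 2D plane (reduced QR gives shape (dim, 2)).
    plane, _ = np.linalg.qr(rng.normal(size=(dim, 2)))
    return coords @ plane.T + noise * rng.normal(size=(n * n, dim))


def explained_variance_ratio(embeddings):
    """Fraction of total variance captured by each principal component."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def adjacent_nearest_neighbor_fraction(embeddings, n):
    """Fraction of grid points whose nearest neighbor in embedding space
    is also a direct (4-connected) neighbor on the grid."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude each point itself
    nearest = dists.argmin(axis=1)
    gx, gy = np.divmod(np.arange(n * n), n)  # grid coordinates of each point
    nx, ny = np.divmod(nearest, n)           # grid coordinates of its neighbor
    return float(np.mean(np.abs(gx - nx) + np.abs(gy - ny) == 1))


n = 10
emb = make_toy_grid_embeddings(n=n, dim=32, noise=0.01)
top2 = float(explained_variance_ratio(emb)[:2].sum())
adjacent = adjacent_nearest_neighbor_fraction(emb, n)
print(f"variance captured by top-2 components: {top2:.4f}")
print(f"nearest neighbor is a grid neighbor:   {adjacent:.2f}")
```

A top-2 variance share close to 1 indicates that the embeddings are effectively confined to a 2D subspace, and a high adjacency fraction indicates that distances in the embedding space respect the neighborhood structure of the grid. Both statistics are simple proxies chosen for this sketch; other measures could serve the same purpose.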
