Multistability of Self-Attention Dynamics in Transformers
Claudio Altafini
TL;DR
The paper analyzes a continuous-time self-attention model for transformers, reframing it as a multiagent dynamical system on the unit sphere and linking it to a multiagent Oja flow that targets the principal eigenvector of the value matrix $V$. It rigorously classifies the equilibria into four classes (consensus, bipartite consensus, clustering, and polygonal) and derives stability conditions, showing that multistability is common and that many stable states align with eigenvectors of $V$, often the principal one. Through theoretical results and numerical experiments, it demonstrates that self-attention dynamics can converge to low-rank attractors and that attention weighting introduces rich bifurcation behavior beyond the classic Oja flow. The findings offer a nonlinear Perron–Frobenius perspective on transformer layers, suggesting that successive layers may tilt token representations toward eigenvectors of $V$ and inviting experimental validation on pretrained models.
Abstract
In machine learning, a self-attention dynamics is a continuous-time multiagent-like model of the attention mechanisms of transformers. In this paper we show that such dynamics is related to a multiagent version of the Oja flow, a dynamical system that computes the principal eigenvector of a matrix, which for transformers is the value matrix. We classify the equilibria of the ``single-head'' self-attention system into four classes: consensus, bipartite consensus, clustering and polygonal equilibria. Multiple asymptotically stable equilibria from the first three classes often coexist in the self-attention dynamics. Interestingly, equilibria from the first two classes are always aligned with the eigenvectors of the value matrix, often but not exclusively with the principal eigenvector.
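
As a rough illustration of the Oja flow mentioned in the abstract, the sketch below integrates the classical single-vector flow $\dot{x} = Vx - (x^\top V x)\,x$ on the unit sphere and checks that the state aligns with the principal eigenvector of $V$. It is not the paper's multiagent model and includes no attention weights. The symmetric test matrix, its eigenvalues, the forward-Euler scheme, the step size, and the name `oja_flow` are all assumptions made for this example.

```python
import numpy as np

def oja_flow(V, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the classical Oja flow
    dx/dt = V x - (x^T V x) x, with x kept on the unit sphere."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        Vx = V @ x
        x = x + dt * (Vx - (x @ Vx) * x)
        x = x / np.linalg.norm(x)  # re-project to correct Euler drift off the sphere
    return x

rng = np.random.default_rng(0)

# Hypothetical symmetric "value matrix" with chosen eigenvalues 3, 1, 0.5, -1
# (symmetry is an assumption of this sketch, not a claim about the paper).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
V = Q @ np.diag([3.0, 1.0, 0.5, -1.0]) @ Q.T

x_final = oja_flow(V, rng.standard_normal(4))

# Compare with the principal eigenvector from a direct eigendecomposition.
eigvals, eigvecs = np.linalg.eigh(V)   # eigenvalues in ascending order
principal = eigvecs[:, -1]
alignment = abs(x_final @ principal)   # sign of an eigenvector is arbitrary
print(f"|<x, v_max>| = {alignment:.6f}")
```

The absolute value in `alignment` reflects that both $v$ and $-v$ are attractors of this flow, a two-point analogue of the coexisting stable equilibria the abstract describes.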
