Versor: A Geometric Sequence Architecture
Truong Minh Huy, Edward Hirst
TL;DR
Versor is a sequence architecture built on Conformal Geometric Algebra (CGA). It embeds states on the Spin(4,1) manifold and updates them with rotor transformations, which enforces $SE(3)$-equivariance. Its core mechanisms, GPA and the Recursive Rotor Accumulator (RRA), provide interpretable attention over proximity and orientation together with linear-time sequence processing and manifold-normalized stability. Versor achieves state-of-the-art or competitive results on chaotic dynamics, topology, and multimodal benchmarks with far fewer parameters than Euclidean baselines. The work also reports strong zero-shot generalization, robustness to distribution shift, and substantial hardware speedups from bit-masked Clifford kernels, pointing to a potential shift toward geometrically aware AI for scientific modeling. Finally, it outlines concrete future directions, including Lie-manifold optimization, Hamiltonian extensions, and dedicated geometric accelerators (GAPU), to further exploit Clifford-based architectures in real-world deployments.
Abstract
This paper introduces Versor, a novel sequence architecture that uses Conformal Geometric Algebra (CGA) in place of the traditional fundamental non-linear operations. The design achieves structural generalization and significant performance improvements on a variety of tasks, while offering improved interpretability and efficiency. By embedding states in the $Cl_{4,1}$ manifold and evolving them via geometric transformations (rotors), Versor natively represents $SE(3)$-equivariant relationships without requiring explicit structural encoding. Versor is validated on chaotic N-body dynamics, topological reasoning, and standard multimodal benchmarks (CIFAR-10, WikiText-103), where it consistently outperforms Transformers, Graph Networks, and geometric baselines (GATr, EGNN). Key results include: orders of magnitude fewer parameters ($200\times$ fewer than Transformers); interpretable attention that decomposes into proximity and orientational components; zero-shot scale generalization (99.3% MCC on topology vs. 50.4% for ViT); and $O(L)$ linear complexity in sequence length $L$ via the novel Recursive Rotor Accumulator. In out-of-distribution tests, Versor maintains stable predictions while Transformers fail catastrophically. Custom Clifford kernels achieve up to $78\times$ speedup, providing a scalable foundation for geometrically aware scientific modeling.
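To make the idea of evolving a state with rotors in $O(L)$ time concrete, the following is a minimal sketch and not the authors' Recursive Rotor Accumulator. It assumes unit quaternions (the rotors of ordinary 3D rotations) as a simplified stand-in for Spin(4,1) rotors, and the function names `quat_mul`, `axis_angle_rotor`, `accumulate_rotors`, and `rotate` are hypothetical. It shows a single left-to-right pass that composes one rotor per step and renormalizes the running state so that it stays on the unit-norm manifold.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def axis_angle_rotor(axis, angle):
    """Unit quaternion for a rotation by `angle` (radians) about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))

def accumulate_rotors(rotors):
    """Compose a sequence of rotors in one pass: O(L) products for L steps.

    Assumption: renormalizing after every step stands in for the
    manifold normalization mentioned in the text.
    """
    state = np.array([1.0, 0.0, 0.0, 0.0])  # identity rotor
    for r in rotors:
        state = quat_mul(r, state)              # apply the new rotor on the left
        state = state / np.linalg.norm(state)   # project back to unit norm
    return state

def rotate(rotor, v):
    """Apply the sandwich product R v R~ to a 3D vector v."""
    reverse = rotor * np.array([1.0, -1.0, -1.0, -1.0])
    pure = np.concatenate(([0.0], np.asarray(v, dtype=float)))
    return quat_mul(quat_mul(rotor, pure), reverse)[1:]

# Usage: two quarter turns about the z-axis compose to a half turn.
steps = [axis_angle_rotor([0, 0, 1], np.pi / 2)] * 2
total = accumulate_rotors(steps)
print(rotate(total, [1.0, 0.0, 0.0]))
```

The running state has a fixed size regardless of sequence length, which is what makes the pass linear in $L$. A full CGA version would use 32-component multivectors in $Cl_{4,1}$ so that translations are also represented by rotors.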
