Attending to Topological Spaces: The Cellular Transformer
Rubén Ballester, Pablo Hernández-García, Mathilde Papillon, Claudio Battiloro, Nina Miolane, Tolga Birdal, Carles Casacuberta, Sergio Escalera, Mustafa Hajij
TL;DR
The Cellular Transformer (CT) extends transformer attention to $2$-dimensional regular cell complexes, enabling self- and cross-attention across cell ranks via incidence relations and cochain signals. It introduces pairwise and general cellular attention, along with topological positional encodings (BSPe, RWPe, TopoSlepiansPE) that embed structural information, and achieves state-of-the-art or competitive performance on three graph datasets lifted to cell complexes, without enhancements such as virtual nodes or graph rewiring. The experiments indicate that global positional encodings often outperform local ones, and that pairwise attention works best when features are heterogeneous across ranks, while general attention suits homogeneous settings. Overall, CT bridges topological deep learning and transformer architectures, opening avenues for richer higher-order representations and broader applications on cell-complex data.
Abstract
Topological Deep Learning seeks to enhance the predictive performance of neural network models by harnessing topological structures in input data. Topological neural networks operate on spaces such as cell complexes and hypergraphs, which can be seen as generalizations of graphs. In this work, we introduce the Cellular Transformer (CT), a novel architecture that generalizes graph-based transformers to cell complexes. First, we propose a new formulation of the usual self- and cross-attention mechanisms, tailored to leverage incidence relations in cell complexes, e.g., edge-face and node-edge relations. Additionally, we propose a set of topological positional encodings specifically designed for cell complexes. By transforming three graph datasets into cell complex datasets, our experiments reveal that CT not only achieves state-of-the-art performance, but does so without the need for more complex enhancements such as virtual nodes, in-domain structural encodings, or graph rewiring.
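To make the idea of attention driven by incidence relations concrete, the following is a minimal NumPy sketch of single-head cross-attention between two ranks of a small cell complex (a filled triangle). It is an illustration under stated assumptions, not the formulation used in the paper: the function name `masked_cross_attention`, the hard masking of attention scores by the boundary matrices, the feature dimensions, and the random weights are all hypothetical choices made for this example.

```python
import numpy as np

def masked_cross_attention(x_tgt, x_src, mask, w_q, w_k, w_v):
    """Single-head attention: target cells attend to the source cells allowed by `mask`.

    Hypothetical helper for illustration; every row of `mask` needs at least one True.
    """
    q = x_tgt @ w_q                                   # queries from target-rank cochain
    k = x_src @ w_k                                   # keys from source-rank cochain
    v = x_src @ w_v                                   # values from source-rank cochain
    scores = q @ k.T / np.sqrt(q.shape[-1])           # scaled dot-product scores
    scores = np.where(mask, scores, -np.inf)          # drop non-incident pairs
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

# Filled triangle: nodes 0,1,2; edges e0=(0,1), e1=(0,2), e2=(1,2); one face.
B1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])        # node-edge boundary matrix (nodes x edges)
B2 = np.array([[1], [-1], [1]])      # edge-face boundary matrix (edges x faces)

rng = np.random.default_rng(0)
d_model = 4
x_nodes = rng.normal(size=(3, 5))    # 0-cochain: node features (dimension 5, assumed)
x_edges = rng.normal(size=(3, 3))    # 1-cochain: edge features (dimension 3, assumed)
x_faces = rng.normal(size=(1, 2))    # 2-cochain: face features (dimension 2, assumed)

# Edges attend to their incident nodes (node-edge relation).
edge_node_mask = (B1 != 0).T         # shape (edges, nodes)
edge_out, edge_w = masked_cross_attention(
    x_edges, x_nodes, edge_node_mask,
    rng.normal(size=(3, d_model)),   # W_Q acts on edge features
    rng.normal(size=(5, d_model)),   # W_K acts on node features
    rng.normal(size=(5, d_model)),   # W_V acts on node features
)

# The face attends to its boundary edges (edge-face relation).
face_edge_mask = (B2 != 0).T         # shape (faces, edges)
face_out, face_w = masked_cross_attention(
    x_faces, x_edges, face_edge_mask,
    rng.normal(size=(2, d_model)),
    rng.normal(size=(3, d_model)),
    rng.normal(size=(3, d_model)),
)
```

Because each rank gets its own query, key, and value projections, cochains with different feature dimensions per rank can be combined in a shared model dimension. Whether incidence should act as a hard mask, as here, or only inform positional encodings is a design choice this sketch does not settle.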
