FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo, Firas Laakom, Yucheng Xu, Wenyi Wang, Jürgen Schmidhuber
TL;DR
FACTS proposes a permutable, graph-structured state-space memory with a memory-input routing mechanism to achieve permutation-invariant spatial-temporal world modelling. By representing memory as a set of latent factors and inputs as nodes, and using attention-based routing plus a linearisation trick around an initial memory Z_0, FACTS enables efficient long-horizon sequence modelling with parallelisable updates. Theoretical guarantees show left-permutation equivariance and right-permutation invariance, while empirical results demonstrate competitive or superior performance across multivariate time-series forecasting, object-centric world modelling, and dynamic graph prediction, including robustness to input permutation. This framework offers a general, scalable approach to robust world modelling with efficient history compression and strong cross-domain applicability.
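The permutation properties mentioned above can be illustrated with a toy attention-based routing step. The following is a minimal sketch, not the authors' implementation: the function name `route`, the scaled dot-product scoring, and the absence of learned projections are all assumptions made for illustration. It only shows why routing from a set of memory factors to a set of input nodes is unaffected by reordering the inputs, and why reordering the memory factors reorders the output rows in the same way.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def route(memory, inputs):
    """Hypothetical routing step: each memory factor attends over all input nodes.

    memory: k factors, each a list of d floats
    inputs: m nodes, each a list of d floats
    Returns k routed vectors, each a list of d floats.
    """
    d = len(memory[0])
    routed = []
    for z in memory:
        # Scaled dot-product score between this factor and every input node.
        scores = [sum(a * b for a, b in zip(z, x)) / math.sqrt(d) for x in inputs]
        weights = softmax(scores)
        # Weighted sum over input nodes; a sum does not depend on node order.
        routed.append([sum(w * x[j] for w, x in zip(weights, inputs)) for j in range(d)])
    return routed


def close(a, b, tol=1e-9):
    """Elementwise comparison of two equally shaped nested lists."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
memory = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]  # 3 factors, d = 4
inputs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]  # 5 nodes, d = 4
out = route(memory, inputs)

in_perm = [2, 0, 4, 1, 3]
mem_perm = [1, 2, 0]
out_in = route(memory, [inputs[i] for i in in_perm])
out_mem = route([memory[i] for i in mem_perm], inputs)

# Invariance: shuffling the input nodes leaves the routed memory unchanged.
print(close(out, out_in))
# Equivariance: shuffling the memory factors shuffles the output rows identically.
print(close([out[i] for i in mem_perm], out_mem))
```

The comparison uses a small tolerance because reordering the inputs changes the order of floating-point summation, not the mathematical result.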
Abstract
World modelling is essential for understanding and predicting the dynamics of complex systems by learning both spatial and temporal dependencies. However, current frameworks, such as Transformers and selective state-space models like Mambas, exhibit limitations in efficiently encoding spatial and temporal structures, particularly in scenarios requiring long-term high-dimensional sequence modelling. To address these issues, we propose a novel recurrent framework, the FACTored State-space (FACTS) model, for spatial-temporal world modelling. The FACTS framework constructs a graph-structured memory with a routing mechanism that learns permutable memory representations, ensuring invariance to input permutations while adapting through selective state-space propagation. Furthermore, FACTS supports parallel computation of high-dimensional sequences. We empirically evaluate FACTS across diverse tasks, including multivariate time series forecasting, object-centric world modelling, and spatial-temporal graph prediction, demonstrating that it consistently outperforms or matches specialised state-of-the-art models, despite its general-purpose world modelling design.
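To make the claim about parallel computation concrete, here is a minimal sketch of how a linear recurrence can be evaluated with a prefix scan instead of a step-by-step loop. It rests on assumptions that are not taken from the paper: the memory is reduced to a single scalar, the update is taken to have the form z_t = a_t * z_{t-1} + b_t (the generic shape of a selective state-space update), and the gates a_t and b_t are assumed to depend only on the inputs rather than on the previous memory. The names `combine`, `sequential`, and `scan` are hypothetical.

```python
def combine(first, second):
    """Compose two affine maps z -> a*z + b, applying `first` and then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def sequential(a, b, z0):
    """Reference implementation: unroll the recurrence one step at a time."""
    z, out = z0, []
    for a_t, b_t in zip(a, b):
        z = a_t * z + b_t
        out.append(z)
    return out


def scan(a, b, z0):
    """Hillis-Steele inclusive scan over affine maps.

    Runs in O(log T) rounds. Within a round every position is updated
    independently, so each round could execute in parallel on suitable hardware.
    """
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] now holds (A_t, B_t) such that z_t = A_t * z0 + B_t.
    return [A * z0 + B for A, B in elems]


a = [0.5, 0.9, 0.1, 0.7, 0.3]   # assumed input-dependent decay gates
b = [1.0, -2.0, 0.5, 0.0, 3.0]  # assumed input-dependent write terms
z0 = 2.0                        # initial memory

seq_out = sequential(a, b, z0)
par_out = scan(a, b, z0)
print(all(abs(x - y) < 1e-9 for x, y in zip(seq_out, par_out)))
```

The scan works because composing affine maps is associative, so the partial compositions can be grouped in any order. A recurrence whose gates depend on the previous memory would not admit this regrouping.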
