Broad Spectrum Structure Discovery in Large-Scale Higher-Order Networks
John Hood, Caterina De Bacco, Aaron Schein
TL;DR
This work introduces Omni-Hype-SMT, a probabilistic framework for discovering broad mesoscale structure in large-scale hypergraphs by clustering nodes into latent classes and these classes into communities. Using a low-rank, symmetric class-affinity tensor Λ^{(d)} factorized across orders with a shared node-class membership Θ and a class-community matrix W, the model flexibly represents assortative and disassortative patterns and is provably identifiable under practical constraints. The approach yields interpretable structures, improves higher-order link prediction across diverse datasets, and enables fast synthetic hypergraph generation with tunable mesoscale properties. Empirically, it uncovers meaningful drug-class interactions, core-periphery behavior in politics, and cross-domain patterns that extend beyond traditional assortative models, highlighting the importance of modeling omniassortativity in higher-order networks.
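To make the factorization described above concrete, here is a minimal NumPy sketch of one way a low-rank, symmetric class-affinity tensor could combine with a shared membership matrix. It is an illustration under stated assumptions, not the paper's exact parameterization: the summary does not give the functional form, so the symmetric CP-style construction, the optional per-community weights `gamma`, and all function names are hypothetical. The sketch assumes the affinity among classes c_1..c_d is a sum over communities k of gamma_k times the product of W[c_j, k], and that the rate of a hyperedge is the affinity weighted by the members' class memberships in Θ.

```python
import numpy as np
from itertools import product


def class_affinity(W, d, gamma=None):
    """Build an order-d symmetric class-affinity tensor (assumed form).

    Lambda[c1, ..., cd] = sum_k gamma[k] * prod_j W[c_j, k]
    W has shape (C, K): classes by communities.
    """
    C, K = W.shape
    gamma = np.ones(K) if gamma is None else np.asarray(gamma)
    lam = np.zeros((C,) * d)
    for k in range(K):
        outer = W[:, k]
        for _ in range(d - 1):
            # Repeated outer products of the same vector give a symmetric tensor.
            outer = np.multiply.outer(outer, W[:, k])
        lam += gamma[k] * outer
    return lam


def hyperedge_rate_dense(theta, lam, nodes):
    """Rate of a hyperedge by brute force over all class assignments.

    Costs O(C^d) per hyperedge; included only as a reference for checking.
    theta has shape (N, C): nodes by classes.
    """
    d = lam.ndim
    rate = 0.0
    for classes in product(range(lam.shape[0]), repeat=d):
        weight = np.prod([theta[i, c] for i, c in zip(nodes, classes)])
        rate += lam[classes] * weight
    return rate


def hyperedge_rate_lowrank(theta, W, nodes, gamma=None):
    """Same rate computed in O(d * C * K) by exploiting the factorization."""
    K = W.shape[1]
    gamma = np.ones(K) if gamma is None else np.asarray(gamma)
    phi = theta[list(nodes)] @ W  # shape (d, K): node-to-community loadings
    return float(gamma @ np.prod(phi, axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = rng.random((6, 4))  # 6 nodes, 4 latent classes (toy sizes)
    W = rng.random((4, 2))      # 4 classes, 2 communities
    nodes = (0, 2, 5)           # one hyperedge of order 3
    lam = class_affinity(W, d=3)
    print(np.isclose(hyperedge_rate_dense(theta, lam, nodes),
                     hyperedge_rate_lowrank(theta, W, nodes)))
```

Under these assumptions the dense tensor never has to be materialized: the low-rank route only needs the product Θ W for the nodes in the hyperedge, which is what makes this kind of factorization attractive for large hypergraphs. The same Θ and W are reused for every order d, mirroring the shared membership described above.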
Abstract
Complex systems are often driven by higher-order interactions among multiple units, naturally represented as hypergraphs. Characterizing the dependency structure within these hypergraphs is crucial for understanding and predicting the behavior of complex systems, but it is made challenging by their combinatorial complexity and computational demands. In this paper, we introduce a class of probabilistic models that efficiently represent and discover a broad spectrum of mesoscale structure in large-scale hypergraphs. The key insight enabling this approach is to treat classes of similar units as themselves nodes in a latent hypergraph. By modeling observed node interactions through latent interactions among classes using low-rank representations, our approach tractably captures rich structural patterns while ensuring model identifiability. This allows for direct interpretation of distinct node- and class-level structures. Empirically, our model improves link prediction over state-of-the-art methods and discovers interpretable structures in diverse real-world systems, including pharmacological and social networks, advancing the ability to incorporate large-scale higher-order data into the scientific process.
