Sheaf theory: from deep geometry to deep learning
Anton Ayzenberg, Thomas Gebhart, German Magai, Grigory Solomadin
TL;DR
This paper surveys the use of sheaf theory as a unifying language for geometric and topological data analysis in deep learning, emphasizing posets as a general base for sheaves and introducing practical algorithms for computing cohomology on arbitrary finite posets. It connects classical sheaf theory, cohomology, and Laplacians with modern deep learning architectures via sheaf learning and neural diffusion, illustrating improvements over traditional graph networks in heterophilic and higher-order settings. Key contributions include a minimal cochain complex for efficient cohomology computation, Morse-cell poset notions for one-shot calculations, and extensive reviews spanning categorical, topological, abelian, real, manifold, and data-level applications, along with numerous proposals for future theory and practice. The work highlights how integrating sheaf-theoretic tools can reveal blind spots in current machine learning pipelines and offers a structured roadmap for developing scalable, interpretable, and topologically informed learning systems.
Abstract
This paper provides an overview of the applications of sheaf theory in deep learning, data science, and computer science in general. The primary text of this work serves as a friendly introduction to applied and computational sheaf theory, accessible to those with modest mathematical familiarity. We describe intuitions and motivations underlying sheaf theory shared by both theoretical researchers and practitioners, bridging classical mathematical theory and its more recent implementations within signal processing and deep learning. We observe that most notions commonly considered specific to cellular sheaves translate to sheaves on arbitrary posets, providing an interesting avenue for further generalization of these methods in applications, and, in response, we present a new algorithm to compute sheaf cohomology on arbitrary finite posets. By integrating classical theory with recent applications, this work reveals certain blind spots in current machine learning practices. We conclude with a list of problems related to sheaf-theoretic applications that we find mathematically insightful and practically instructive to solve. To ensure the exposition of sheaf theory is self-contained, the appendices provide a rigorous mathematical introduction that moves from diagrams and sheaves to the definition of derived functors, higher-order cohomology, sheaf Laplacians, sheaf diffusion, and the interconnections among these subjects.
