Neural Spacetimes for DAG Representation Learning

Haitz Sáez de Ocáriz Borde; Anastasis Kratsios; Marc T. Law; Xiaowen Dong; Michael Bronstein

TL;DR

Neural Spacetimes (NSTs) address DAG representation learning by embedding nodes as events in a trainable spacetime manifold, decomposing the representation into a space via a neural quasi-metric $\mathcal{D}$ and a time via a neural partial order $\mathcal{T}$, all guided by a learnable encoder $\mathcal{E}$. The framework provides a global embedding guarantee: for any finite DAG with $k$ nodes, there exists an NST with distortion $1+\mathcal{O}(\log k)$ that preserves causality, with parameter complexity $\tilde{\mathcal{O}}(D+W+k^{5/2}D^4N)$; planar posets can be embedded with $D=\mathcal{O}(\log k)$ spatial dimensions and $W=2$ time dimensions. NSTs are trained end-to-end and show lower embedding distortions than fixed-spacetime baselines on both synthetic and real-world graphs, illustrating the practical benefits of learning geometry. By jointly optimizing space and time, NSTs enable scalable, data-driven causal embeddings for directed graphs with potential downstream impact on causal inference and network analysis via richer, adaptive representations.

Abstract

We propose a class of trainable deep learning-based geometries called Neural Spacetimes (NSTs), which can universally represent nodes in weighted directed acyclic graphs (DAGs) as events in a spacetime manifold. While most works in the literature focus on undirected graph representation learning or causality embedding separately, our differentiable geometry can encode both graph edge weights in its spatial dimensions and causality in the form of edge directionality in its temporal dimensions. We use a product manifold that combines a quasi-metric (for space) and a partial order (for time). NSTs are implemented as three neural networks trained in an end-to-end manner: an embedding network, which learns to optimize the location of nodes as events in the spacetime manifold, and two other networks that optimize the space and time geometries in parallel, which we call a neural (quasi-)metric and a neural partial order, respectively. The latter two networks leverage recent ideas at the intersection of fractal geometry and deep learning to shape the geometry of the representation space in a data-driven fashion, unlike other works in the literature that use fixed spacetime manifolds such as Minkowski space or De Sitter space to embed DAGs. Our main theoretical guarantee is a universal embedding theorem, showing that any $k$-point DAG can be embedded into an NST with $1+\mathcal{O}(\log(k))$ distortion while exactly preserving its causal structure. The total number of parameters defining the NST is sub-cubic in $k$ and linear in the width of the DAG. If the DAG has a planar Hasse diagram, this is improved to $\mathcal{O}(\log(k)) + 2$ spatial and 2 temporal dimensions. We validate our framework computationally with synthetic weighted DAGs and real-world network embeddings; in both cases, the NSTs achieve lower embedding distortions than their counterparts using fixed spacetime geometries.

Paper Structure (37 sections, 7 theorems, 74 equations, 7 figures, 11 tables, 4 algorithms)

Introduction
Preliminaries: Directed Graphs, Posets, and Quasi-Metrics
Causal Structure
Spatial Structure
Spacetime Embeddings
Neural Spacetimes
Embedding Guarantees
Computational Implementation
Experimental Results
Conclusion
Additional Background
Dimension and Size of a Metric Space
Invertible Positive Matrices
Neural Snowflakes
Pseudo-Riemannian Manifolds and Lorentzian Spacetimes
...and 22 more sections

Key Result

Proposition 1

If $T\in \mathbb{N}_+$, $D=0$, and $\mathcal{T}:\mathbb{R}^{D+T}\to \mathbb{R}^T$ admits a representation of the form given by the paper's neural spacetime equation (label eq:neural_spacetime), then $\lesssim^{\mathcal{T}}$ is a partial order on $\mathbb{R}^{D+T}$. The proof is given in the paper (label proof:prop:neuralspacetime).
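To make the notion of a partial order concrete, the following minimal sketch checks the three axioms (reflexivity, antisymmetry, transitivity) by brute force on a small grid. It assumes the coordinatewise (product) order on $\mathbb{R}^T$ with $T=2$ and takes $\mathcal{T}$ to be the identity; both are illustrative assumptions, not the paper's exact construction, and the helper name `precedes` is hypothetical.

```python
from itertools import product


def precedes(x, y):
    """Assumed order: x precedes y if x <= y in every time coordinate."""
    return all(a <= b for a, b in zip(x, y))


# Small grid of sample events in R^2 (T = 2 time dimensions, D = 0).
points = list(product(range(3), repeat=2))

# Reflexivity: every point precedes itself.
reflexive = all(precedes(p, p) for p in points)

# Antisymmetry: mutual precedence forces equality.
antisymmetric = all(
    p == q
    for p in points
    for q in points
    if precedes(p, q) and precedes(q, p)
)

# Transitivity: p <= q and q <= r imply p <= r.
transitive = all(
    precedes(p, r)
    for p in points
    for q in points
    for r in points
    if precedes(p, q) and precedes(q, r)
)

# The order is partial rather than total: some pairs are incomparable.
incomparable = [
    (p, q)
    for p in points
    for q in points
    if not precedes(p, q) and not precedes(q, p)
]

print(reflexive, antisymmetric, transitive, len(incomparable) > 0)
```

Pairs such as `(0, 1)` and `(1, 0)` are incomparable under this order, which is what allows a time embedding to leave causally unrelated nodes unordered.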

Figures (7)

Figure 1: A Neural Spacetime (NST) is a learnable triplet $\mathcal{S}=(\mathcal{E},\mathcal{D},\mathcal{T})$, where $\mathcal{E}:\mathbb{R}^{N}\rightarrow\mathbb{R}^{D+T}$ is a (feature) encoder network, $\mathcal{D}:\mathbb{R}^{D+T}\times \mathbb{R}^{D+T}\to [0,\infty)$ is a learnable quasi-metric on $\mathbb{R}^{D}$ and $\mathcal{T}:\mathbb{R}^{D+T}\to\mathbb{R}^{T}$ is a learnable partial order on $\mathbb{R}^{T}$. Given an input Directed Acyclic Graph (DAG), $\mathcal{E}$ optimizes the location of the nodes $u,v,w$ as events in the spacetime manifold $\hat{u},\hat{v},\hat{w}$, while concurrently $\mathcal{D}$ and $\mathcal{T}$ learn the geometry of space and time themselves. The objective is to find a geometry that can faithfully represent, with minimal distortion, the metric geometry of the input DAG in space as well as its causal connectivity in time.
Figure 2: The red arrows illustrate the Hasse diagram (DAG) with directed edge set $\{ (A, B), (B, C), (B, D), (D, E) \}$. Its corresponding poset is depicted using green arrows on the same vertex set $\{A,B,C,D,E\}$. Our partial order is not a total order as there is no red or green arrow between $C$ and $E$. Red arrows encode key DAG structure and green arrows encode all their possible compositions.
Figure 3: Spacetime Embeddings (Definition 2): We illustrate a spacetime embedding of the directed graph in Figure 2 into $\mathbb{R}^4=\mathbb{R}^2\boldsymbol{\times} \mathbb{R}^2$ with $2$ space dimensions and $2$ time dimensions. Notice that the spatial component of the spacetime embedding is not a causal embedding and vice versa.
Figure 4: Koch snowflake with increasing number of refinement iterations from left to right.
Figure 5: NST activation visualizations.
...and 2 more figures
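
The relationship between the Hasse diagram and its poset in Figure 2, and the idea of encoding causality in two time dimensions as in Figure 3, can be sketched in a few lines. The edge set is taken from the Figure 2 caption. The 2D time coordinates in `time_coords` are hypothetical values chosen by hand for illustration (they are not taken from the paper), and the coordinatewise order is an assumed stand-in for the learned partial order.

```python
from itertools import product

nodes = ["A", "B", "C", "D", "E"]
# Directed edges of the Hasse diagram (red arrows in Figure 2).
hasse_edges = {("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")}


def transitive_closure(edges):
    """Add every composition of edges until no new pair appears."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (u, v), (w, x) in product(list(closure), repeat=2):
            if v == w and (u, x) not in closure:
                closure.add((u, x))
                changed = True
    return closure


# Strict order relation of the poset: red arrows plus green compositions.
poset = transitive_closure(hasse_edges)
composed = poset - hasse_edges  # the green arrows only

# Hypothetical time coordinates in R^2, chosen by hand (assumption).
time_coords = {
    "A": (0, 0),
    "B": (1, 1),
    "C": (2, 4),
    "D": (3, 2),
    "E": (4, 3),
}


def strictly_precedes(x, y):
    """Assumed coordinatewise order on R^2, excluding equality."""
    return x != y and all(a <= b for a, b in zip(x, y))


# Order relation induced on the nodes by the time coordinates.
induced = {
    (u, v)
    for u in nodes
    for v in nodes
    if strictly_precedes(time_coords[u], time_coords[v])
}

print(sorted(composed))
print(induced == poset)
```

With these coordinates the induced order coincides with the poset, and $C$ and $E$ remain incomparable, matching the caption's observation that the order is not total.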

Theorems & Definitions (29)

Remark 1: Feature Vectors and Nodes
Example 1: From DAGs to Posets
Example 2: From Posets to DAGs: Hasse Diagrams
Example 3
Definition 1: Causal (Quasi-)Metric Space
Definition 2: Spacetime Embedding
Remark 2: Optimal Distortion
Definition 3: Neural (Quasi-)Metric Activation
Proposition 1: Neural Spacetimes Always Implement Partial Orders
Remark 3: Comparing Neural Snowflakes to Neural (Quasi-)metrics in NSTs.
...and 19 more
