Graphcode: Learning from multiparameter persistent homology using graph neural networks

Michael Kerber, Florian Russold

TL;DR

We introduce graphcodes, a practical two-parameter topological descriptor for data filtered along two scales, built by stacking one-parameter persistence diagrams and connecting consecutive diagrams via a bipartite graph to form an embedded graph in $\mathbb{R}^3$ that can be ingested directly by graph neural networks. The construction depends on fixed barcode bases, so the graphcode is not a topological invariant but provides a complete combinatorial description of the bifiltration's persistence module and enables efficient, out-of-order matrix-reduction computation with $O(n^3)$ worst-case complexity. A simple GNN pipeline processes graphcodes through attention layers, per-slice pooling, and dense layers, achieving competitive accuracy against state-of-the-art multiparameter descriptors while often offering faster computation. Experiments on graphs, shapes, and point processes demonstrate strong discriminative performance, with the approach particularly advantageous on larger datasets where the graphcode–GNN combination outperforms one-parameter baselines and other vectorizations. The work generalizes PersLay to bifiltrations and provides a practical bridge between multiparameter persistent homology and modern deep learning, suggesting further gains as bifiltration techniques mature.

Abstract

We introduce graphcodes, a novel multi-scale summary of the topological properties of a dataset that is based on the well-established theory of persistent homology. Graphcodes handle datasets that are filtered along two real-valued scale parameters. Such multi-parameter topological summaries are usually based on complicated theoretical foundations and are difficult to compute; in contrast, graphcodes yield an informative and interpretable summary and can be computed as efficiently as one-parameter summaries. Moreover, a graphcode is simply an embedded graph and can therefore be readily integrated into machine learning pipelines using graph neural networks. We describe such a pipeline and demonstrate that graphcodes achieve better classification accuracy than state-of-the-art approaches on various datasets.

Paper Structure (26 sections, 1 theorem, 12 equations, 5 figures, 4 tables)

This paper contains 26 sections, 1 theorem, 12 equations, 5 figures, 4 tables.

Key Result

Proposition B.1

If a matrix $A$ gets reduced to $R$ as above, and a column addition $c_j\gets c_i+c_j$ happens during the reduction process, then $c_i$ is a column of $R$.
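
The property in Proposition B.1 can be checked on a small example. The sketch below assumes the standard left-to-right column reduction over $\mathbb{Z}_2$ used for one-parameter persistence, which may differ in detail from the reduction the proposition refers to with "as above". The function names (`low`, `reduce_matrix`) and the example matrix (the boundary matrix of a filled triangle) are illustrative choices, not taken from the paper.

```python
def low(col):
    """Largest row index of a column stored as a set of row indices, or None if empty."""
    return max(col) if col else None


def reduce_matrix(columns):
    """Standard left-to-right column reduction over Z_2 (assumed variant).

    Returns the reduced columns R and a log with one entry (i, j, snapshot of c_i)
    for every column addition c_j <- c_i + c_j performed along the way.
    """
    R = [set(c) for c in columns]
    pivot_owner = {}  # pivot row -> index of the already reduced column with that pivot
    log = []
    for j in range(len(R)):
        while R[j] and low(R[j]) in pivot_owner:
            i = pivot_owner[low(R[j])]
            log.append((i, j, frozenset(R[i])))
            R[j] ^= R[i]  # addition over Z_2 is symmetric difference
        if R[j]:
            pivot_owner[low(R[j])] = j
    return R, log


# Boundary matrix of a filled triangle (hypothetical example):
# indices 0,1,2 are vertices, 3=(0,1), 4=(1,2), 5=(0,2) are edges, 6 is the triangle.
A = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
R, log = reduce_matrix(A)

# Every column c_i that was added to another column is unchanged in the final matrix R.
added_columns_are_final = all(set(snapshot) == R[i] for i, _, snapshot in log)
```

In this variant the property holds because a column is only ever added after it has been fully reduced, so it is never modified again.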

Figures (5)

  • Figure 1: Schematic overview of our approach.
  • Figure 2: Left: A simplicial complex with $11$ $0$-simplices, $19$ $1$-simplices and $7$ $2$-simplices. A $2$-chain consisting of three $2$-simplices is marked in a darker color, and its boundary, a collection of $7$ $1$-simplices, is drawn with thick lines. Right: The $1$-cycle drawn with thick lines on the left is also a $1$-boundary, since it is the image of the $4$ marked $2$-simplices under the boundary operator. On the right, the $1$-cycle $\alpha$ going along the path $ABCDE$ is not a $1$-boundary; therefore it is a generator of a homology class $[\alpha]$ of $H_1(K)$. Likewise, the $1$-cycle $\alpha'$ going along $ABCFGH$ is not a $1$-boundary either. Furthermore, $[\alpha']=[\alpha]$ since the sum $\alpha+\alpha'$ is the $1$-cycle given by the path $AEDCFGH$, which is a $1$-boundary because of the $5$ marked $2$-simplices. Hence, $\alpha$ and $\alpha'$ represent the same homology class, which is characterized by looping around the same hole in $K$.
  • Figure 3: Left, lower row: $Z_1(L)$ is generated by the cycles $abcd$ and $abd$. They form a barcode basis, with attached bars $[1,3)$ and $[2,2)$, respectively. Note that also $abd$ and $bcd$ form a basis of $Z_1(L)$, but that is not a barcode basis as none of these cycles is already born at $L_1$, so they do not induce a basis of $Z_1(L_1)$. Left, upper row: Here, $abd$ and $bcd$ form a barcode basis with attached bars $[0,2)$ and $[1,3)$, respectively, and $abd$ and $abcd$ as well (with identical barcode). Right: Choosing the basis $abcd$, $abd$ for $Z_1(L)$ and $abd$ and $bcd$ for $Z_1(K)$, we have $abcd=abd+bcd$, hence the cycle $abcd$ has two outgoing edges, to both basis elements in $K$. We ignore the basis vector $abd$ of $L$ in the figure, since its birth and death index coincide, so the corresponding feature has persistence zero.
  • Figure 4: Neural network architecture for graphcodes.
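
The edge rule described in the caption of Figure 3 (a basis cycle of $L$ gets an outgoing edge to each basis cycle of $K$ that appears in its expansion) can be illustrated with a short sketch. It assumes cycles are represented as sets of edges with addition over $\mathbb{Z}_2$, and that the cycles $abd$, $bcd$ and $abcd$ consist of the edges named in the code; the edge names and the helper `express` are hypothetical, and the brute-force search is only meant for tiny examples, not as the paper's algorithm.

```python
from itertools import combinations

# Cycles as sets of edges (assumed edge sets); Z_2 addition is symmetric difference.
abd = frozenset({"ab", "bd", "ad"})
bcd = frozenset({"bc", "cd", "bd"})
abcd = frozenset({"ab", "bc", "cd", "ad"})


def express(target, basis):
    """Return the names of the basis cycles whose Z_2 sum equals target, or None."""
    names = list(basis)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            total = frozenset()
            for name in subset:
                total = total ^ basis[name]
            if total == target:
                return list(subset)
    return None


# Basis chosen for Z_1(K) in the caption.
basis_K = {"abd": abd, "bcd": bcd}

# abcd = abd + bcd, so abcd gets one outgoing edge to each of the two basis cycles of K.
edges_from_abcd = express(abcd, basis_K)
```

With these edge sets, `edges_from_abcd` lists both `abd` and `bcd`, matching the two outgoing edges described in the caption.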

Theorems & Definitions (6)

  • Definition 2.1
  • Proposition B.1
  • Example D.1
  • Example D.2
  • Definition D.3: Graphcode general
  • Example D.4