Hypercurveball algorithm for sampling hypergraphs with fixed degrees

Yanna J. Kraakman, Clara Stegehuis

TL;DR

The paper extends the Curveball algorithm to hypergraphs by introducing the Hypercurveball algorithm, a Markov chain of node-incidence trades that preserves the degree sequence of both undirected and directed hypergraphs. It proves uniform sampling for several hypergraph spaces, identifies spaces in which bias can arise, and shows that in some spaces hypertrades are not equivalent to hyperedge-shuffles. Experiments on real and artificial datasets suggest polynomial scaling of the mixing time and show that the relative speed of Hypercurveball versus hyperedge-shuffling depends on degree statistics, notably the comparison between node- and hyperedge-minimum degree expectations. The work broadens the null-model toolkit for higher-order network analysis and hypothesis testing, gives practical guidance on when to prefer Hypercurveball over hyperedge-shuffling, and outlines open problems on stronger uniformity results and more powerful trades.

Abstract

Comparative analysis between a network and a random graph model can uncover network properties that significantly deviate from those in random networks. The standard random graph model used for comparison uniformly samples random graphs with the same degrees as the network data, often achieved through edge-swap algorithms. However, for hypergraphs, fewer such methodologies are available. This study introduces the Hypercurveball algorithm, designed to sample random, potentially directed, hypergraphs with fixed degrees. Minor adjustments enable the sampling of hypergraphs without degenerate hyperedges, self-loops, or multi-hyperedges. For most of these algorithms, we prove whether they sample uniformly or with bias. We experimentally show that the Hypercurveball algorithm can be significantly faster or slower than the standard hyperedge-shuffling algorithm, which is the hyperedge-equivalent of the edge-swap algorithm. We present criteria on the hypergraph degree sequence that indicate when the Hypercurveball algorithm is more efficient than the standard hyperedge-shuffling method. Finally, our experimental results suggest polynomial scaling of the mixing time for both the Hypercurveball and hyperedge-shuffling algorithms.
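To make the idea of a degree-preserving trade concrete, here is a minimal Python sketch of a Curveball-style trade on incidence sets. It is an illustration under stated assumptions, not the paper's algorithm: it covers only the undirected case, it treats each incidence set as a plain set (so no node appears twice in a hyperedge), and the names `curveball_trade` and `hyperedge_sizes` are hypothetical. The paper's hypertrade and its variants may differ in detail.

```python
import random
from collections import Counter


def curveball_trade(incidence, u, v, rng):
    """Curveball-style trade between nodes u and v (illustrative sketch).

    `incidence` maps each node to the set of hyperedge labels it belongs to.
    Assumption: plain sets, i.e. no node occurs twice in one hyperedge.
    """
    shared = incidence[u] & incidence[v]
    only_u = incidence[u] - shared
    only_v = incidence[v] - shared
    # Pool the hyperedges held by exactly one of the two nodes and
    # redistribute them at random, keeping each node's degree fixed.
    pool = sorted(only_u | only_v)
    rng.shuffle(pool)
    k = len(only_u)
    incidence[u] = shared | set(pool[:k])
    incidence[v] = shared | set(pool[k:])


def hyperedge_sizes(incidence):
    """Number of nodes in each hyperedge, derived from the incidence sets."""
    return Counter(e for edges in incidence.values() for e in edges)


# Toy hypergraph (made up for this example).
incidence = {"a": {"e1", "e2"}, "b": {"e2", "e3", "e4"}, "c": {"e1", "e4"}}
before_degrees = {n: len(s) for n, s in incidence.items()}
before_sizes = hyperedge_sizes(incidence)

curveball_trade(incidence, "a", "b", random.Random(0))
after_degrees = {n: len(s) for n, s in incidence.items()}
after_sizes = hyperedge_sizes(incidence)
```

In this sketch every pooled hyperedge still contains exactly one of the two traded nodes afterwards, so node degrees and hyperedge sizes are both unchanged. A sampler would repeat such trades on randomly chosen node pairs.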

Paper Structure

This paper contains 30 sections, 12 theorems, 37 equations, 16 figures, 10 tables, 5 algorithms.

Key Result

Theorem 1

The Hypercurveball algorithm samples uniformly from for any undirected hypergraph degree sequence $\vb*{d}_1$ and directed hypergraph degree sequence $\vb*{d}_2$.

Figures (16)

  • Figure 1: A directed hypergraph with node set $V=\{a,b,c,d,e,f\}$ and hyperedge set $E=\{e_1,e_2,e_3,e_4,e_5\}$, where $e_1=(\{a,d\},\{a,b\}), e_2=(\{d,d\},\{e\}),e_3=(\{b\},\{c\}), e_4=(\{b\},\{c\})$ and $e_5=(\{c,f\},\{c,f\})$. Hyperedges $e_3$ and $e_4$ are multi-hyperedges, hyperedge $e_2$ is a degenerate hyperedge and hyperedge $e_5$ is a self-loop. The incidence sets of the nodes are $I_a = (\{e_1\},\{e_1\}), I_b = (\{e_3,e_4\},\{e_1\}), I_c = (\{e_5\},\{e_3,e_4,e_5\}), I_d= (\{e_1,e_2,e_2\},\emptyset),I_e = (\emptyset,\{e_2\})$ and $I_f = (\{e_5\},\{e_5\}).$
  • Figure 2: Changing the incidence sets of a hypergraph also changes the edge set.
  • Figure 3: Performing hypertrade($c,e$) on $H_1$ can result in any of the hypergraphs $H_1$, $H_2$, $H_3$ or $H_4$.
  • Figure 4: Performing a hyperedge-shuffle on the hyperedges $e_2$ and $e_3$ in $H_1$ (same as in Figure 3) can result in any of the hypergraphs $H_1$, $H_2$, $H_5$, $H_6$, $H_7$ or $H_8$.
  • Figure 5: Two directed hypergraphs $H_1,H_2 \in (\mathcal{H}_{d,m}(\vb*{d}^*) \cap \mathcal{H}_{d}(\vb*{d}^*))$ which are connected using hyperedge-shuffles, and not connected using hypertrades in either space $\mathcal{H}_{d,m}(\vb*{d}^*)$ or $\mathcal{H}_d(\vb*{d}^*)$.
  • ...and 11 more figures
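The incidence sets listed in the Figure 1 caption can be recomputed from the hyperedge definitions. The Python sketch below does so, using only the data in the caption. Two things are assumptions made for this example: the helper name `incidence_multisets`, and the encoding of each directed hyperedge as a pair of lists (first component, second component), with `Counter` standing in for a multiset so that the repeated node in $e_2$ is kept.

```python
from collections import Counter

# Directed hyperedges from the Figure 1 caption, as (first, second) components.
hyperedges = {
    "e1": (["a", "d"], ["a", "b"]),
    "e2": (["d", "d"], ["e"]),  # degenerate: d occurs twice
    "e3": (["b"], ["c"]),
    "e4": (["b"], ["c"]),       # same as e3: multi-hyperedges
    "e5": (["c", "f"], ["c", "f"]),
}


def incidence_multisets(hyperedges, nodes):
    """Map each node v to I_v = (multiset of hyperedges whose first component
    contains v, multiset of hyperedges whose second component contains v)."""
    inc = {v: (Counter(), Counter()) for v in nodes}
    for label, (first, second) in hyperedges.items():
        for v in first:
            inc[v][0][label] += 1
        for v in second:
            inc[v][1][label] += 1
    return inc


incidence = incidence_multisets(hyperedges, "abcdef")

# Labels of hyperedges that coincide with another hyperedge.
seen = Counter((tuple(sorted(f)), tuple(sorted(s))) for f, s in hyperedges.values())
multi = sorted(
    label
    for label, (f, s) in hyperedges.items()
    if seen[(tuple(sorted(f)), tuple(sorted(s)))] > 1
)
```

Here `incidence["d"]` holds `e2` with multiplicity two in its first component, matching $I_d = (\{e_1,e_2,e_2\},\emptyset)$, and `multi` picks out `e3` and `e4`.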

Theorems & Definitions (22)

  • Definition 2.1: Hypergraph space [kraakman2024]
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 4.1: Perturbation degree
  • Lemma 7.1
  • proof
  • Lemma 7.2
  • proof
  • ...and 12 more