Perplexity-Homophily Index: Homophily through Diversity in Hypergraphs

Gaurav Kumar, Akrati Saxena, Chandrakala Meena

TL;DR

The paper addresses measuring homophily in higher-order networks modeled as hypergraphs by introducing an edge-centric framework based on interaction perplexity $D(e)$ and a degree-aware baseline $B_{|e|}$. Homophily is quantified as a normalized diversity gap $\phi(e)=\frac{B_{|e|}-D(e)}{B_{|e|}-1}$ and aggregated into the Perplexity-Homophily Index $\Phi(H)=\frac{1}{|E|}\sum_{e\in E}\phi(e)$, with a $k$-uniform extension $\Phi(H_k)$ that connects to Newman's assortativity for $k=2$. The method is validated on synthetic and real-world hypergraphs, showing that $\Phi(H)$ captures the full distribution of homophily and reveals how homophilic and heterophilic tendencies vary with interaction size across domains such as shopping, politics, and education. This framework offers a flexible, interpretable, and comparable measure for higher-order homophily and lays the groundwork for temporal, multilayer, and model-based extensions in complex systems.

Abstract

Real-world complex systems are often better modeled as hypergraphs, where edges represent group interactions involving multiple entities. Understanding and quantifying homophily (similarity-driven association) in such networks is essential for analyzing community formation and information flow. We propose a hyperedge-centric framework to quantify homophily in hypergraphs. Each interaction is represented as a hyperedge, and its interaction perplexity measures the effective number of distinct attributes it contains. Comparing this observed perplexity with a degree-preserving random baseline defines the diversity gap, which quantifies how much more or less diverse an interaction is than expected by chance. The global homophily score for a network, called the Perplexity-Homophily Index, is computed by averaging the normalized diversity gap across all hyperedges. Experiments on synthetic and real-world datasets show that the proposed index captures the full distribution of homophily and reveals how homophilic and heterophilic tendencies vary with interaction size in hypergraphs.
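The pipeline described here (perplexity per hyperedge, comparison with a baseline, normalization, averaging) can be illustrated with a minimal sketch. It rests on assumptions not confirmed by this summary: perplexity is taken to be the exponential of the Shannon entropy of the attribute labels inside a hyperedge, which is the usual "effective number" reading, and the baseline $B_{|e|}$ is passed in as a plain number per hyperedge size because its degree-preserving construction is not spelled out here. The function names, the toy data, and the baseline value 1.5 are all hypothetical.

```python
import math
from collections import Counter


def perplexity(labels):
    """D(e): effective number of distinct attributes in one hyperedge.

    Assumption: computed as exp(Shannon entropy) of the label frequencies.
    """
    counts = Counter(labels)
    n = len(labels)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(entropy)


def diversity_gap(labels, baseline):
    """phi(e) = (B - D(e)) / (B - 1); only defined when baseline > 1."""
    return (baseline - perplexity(labels)) / (baseline - 1.0)


def homophily_index(hyperedges, node_attr, baseline_by_size):
    """Phi(H): mean of phi(e) over all hyperedges."""
    gaps = [
        diversity_gap([node_attr[v] for v in e], baseline_by_size[len(e)])
        for e in hyperedges
    ]
    return sum(gaps) / len(gaps)


# Toy hypergraph: four nodes, two attributes, three hyperedges of size 2.
node_attr = {1: "A", 2: "A", 3: "B", 4: "B"}
hyperedges = [(1, 2), (3, 4), (1, 3)]

# Placeholder baseline for size-2 hyperedges (illustrative value only,
# not derived from a degree-preserving null model).
baseline_by_size = {2: 1.5}

print(homophily_index(hyperedges, node_attr, baseline_by_size))
```

Under these assumptions, a hyperedge whose members all share one attribute has $D(e)=1$ and therefore $\phi(e)=1$, while a hyperedge more diverse than the baseline gets a negative score, so the average balances homophilic against heterophilic interactions.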

Paper Structure

This paper contains 7 sections, 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Perplexity $D(e)$ for a hyperedge with two attributes $A$ and $B$.
  • Figure 2: (a) Homophily vs. $p$ for a $10$-uniform hypergraph with $1000$ nodes and $10$ evenly distributed attributes; (b) comparison of homophily scores for $k$-uniform hypergraphs with varying $p$ on a $1000$-node network with $10$ evenly distributed attributes.
  • Figure 3: $\Phi(H_k)$ vs. $k$ across all datasets.
  • Figure 4: Perplexity vs. hyperedge size ($k$).