Inference of dynamic hypergraph representations in temporal interaction data
Alec Kirkley
TL;DR
This work addresses how to represent temporal interaction data between two categories of items as a sequence of hypergraph snapshots, automatically selecting the temporal windows using the minimum description length (MDL) principle. It proposes a description length encoding $\mathcal{L}_{\text{total}}(\mathcal{X},\bm{\tau}) = \mathcal{L}_1 + \mathcal{L}_2 + \mathcal{L}_3$ and finds the MDL-optimal hypergraph snapshots with either an exact dynamic programming algorithm or a fast greedy method. Demonstrations on synthetic data show recovery of planted hypergraph structure under noise, and applications to NYC FourSquare check-ins reveal meaningful, interpretable patterns of human mobility and activity localization. The approach provides a principled, data-driven framework for nonparametric summarization of higher-order temporal interactions, with potential extensions to additional structural regularities and Bayesian formulations.
Abstract
A range of systems across the social and natural sciences generate datasets consisting of interactions between two distinct categories of items at various instances in time. Online shopping, for example, generates purchasing events of the form (user, product, time of purchase), and mutualistic interactions in plant-pollinator systems generate pollination events of the form (insect, plant, time of pollination). These datasets can be meaningfully modeled as temporal hypergraph snapshots in which multiple items within one category (e.g., online shoppers) share a hyperedge if they interacted with a common item in the other category (e.g., purchased the same product) within a given time window, allowing for the application of hypergraph analysis techniques. However, it is often unclear how to choose the number and duration of these temporal snapshots, which strongly influence the final hypergraph representations. Here we propose a principled nonparametric solution to this problem by extracting temporal hypergraph snapshots that optimally capture structural regularities in temporal event data according to the minimum description length principle. We demonstrate our methods on real and synthetic datasets, finding that they can recover planted artificial hypergraph structure in the presence of considerable noise and reveal meaningful activity fluctuations in human mobility data.
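To make the snapshot construction concrete, the following minimal sketch groups (source, target, time) events into hypergraph snapshots for a given set of window boundaries. The function name `build_snapshots`, the toy events, and the hand-picked boundary are illustrative assumptions and do not come from the paper; in particular, the boundaries are supplied manually here, whereas the proposed method infers them by minimizing the description length.

```python
from bisect import bisect_right
from collections import defaultdict


def build_snapshots(events, boundaries):
    """Group (source, target, time) events into hypergraph snapshots.

    `boundaries` is a sorted list of interior cut points (an assumption of
    this sketch): window k holds events with boundaries[k-1] <= t < boundaries[k],
    with the first and last windows unbounded on the outside.
    Each snapshot maps a target item to the hyperedge (set of sources)
    that interacted with it inside that window.
    """
    snapshots = [defaultdict(set) for _ in range(len(boundaries) + 1)]
    for source, target, t in events:
        k = bisect_right(boundaries, t)  # index of the window containing t
        snapshots[k][target].add(source)
    # Freeze the hyperedges so each snapshot is a plain, comparable dict.
    return [
        {target: frozenset(members) for target, members in snap.items()}
        for snap in snapshots
    ]


# Hypothetical purchasing events of the form (user, product, time).
events = [
    ("alice", "book", 1),
    ("bob", "book", 3),
    ("carol", "lamp", 4),
    ("alice", "lamp", 12),
    ("carol", "lamp", 15),
]

# One hand-picked cut at t = 10 yields two windows: t < 10 and t >= 10.
snapshots = build_snapshots(events, boundaries=[10])
for k, snap in enumerate(snapshots):
    print(k, {target: sorted(members) for target, members in snap.items()})
```

Moving the cut point changes which users end up in the same hyperedge, which is why the choice of windows has such a strong effect on the resulting representation.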
