Beyond Linearity and Time-Homogeneity: Relational Hyper Event Models with Time-Varying Non-Linear Effects

Martina Boschi, Jürgen Lerner, Ernst C. Wit

TL;DR

The paper addresses the linearity assumption in relational hyper-event models (RHEMs) by introducing jointly time-varying non-linear effects (TVNLE) built from tensor-product smooths. It extends relational event models (REMs) to relational hyperevents with flexible covariate effects, provides a penalized logistic partial-likelihood inference framework, and shows through simulations that TVNLE recovers linear, time-varying, and non-linear dynamics while smoothing reduces overfitting. The empirical application to the DBLP coauthorship-citation network reveals non-monotone and time-varying patterns (e.g., author self-citation, citation similarity) that linear models miss, underscoring the method’s ability to uncover complex diffusion and collaboration dynamics over eight decades. Overall, the approach enriches the modeling toolbox for dynamic hypergraphs, giving deeper insight into how higher-order interactions evolve over time.

Abstract

Recent technological advances have made it easier to collect large and complex networks of time-stamped relational events connecting two or more entities. Relational hyper-event models (RHEMs) aim to explain the dynamics of these events by modeling the event rate as a function of statistics based on past history and external information. However, despite the complexity of the data, most current RHEM approaches still rely on a linearity assumption to model this relationship. In this work, we address this limitation by introducing a more flexible model that allows the effects of statistics to vary non-linearly and over time. While time-varying and non-linear effects have been used in relational event modeling, we take this further by modeling joint time-varying and non-linear effects using tensor product smooths. We validate our methodology on both synthetic and empirical data. In particular, we use RHEMs to study how patterns of scientific collaboration and impact evolve over time. Our approach provides deeper insights into the dynamic factors driving relational hyper-events, allowing us to evaluate potential non-monotonic patterns that cannot be identified using linear models.
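
To make the idea of a joint time-varying and non-linear effect concrete, the following is a minimal sketch of a tensor-product smooth $f(x, t)$, built as the row-wise product of two marginal bases. It is an illustration under stated assumptions, not the authors' implementation: the degree-1 "hat" basis, the knot placement, the second-difference penalty, and all names (`hat_basis`, `tensor_product_basis`, `tensor_penalty`, `lam_x`, `lam_t`) are hypothetical choices made here, and the data are random placeholders.

```python
import numpy as np

def hat_basis(values, knots):
    """Degree-1 B-spline ("hat") basis on equally spaced knots.

    Assumption: a simple stand-in for the marginal basis; the paper's
    actual basis choice is not stated in this excerpt.
    """
    values = np.asarray(values, dtype=float)
    knots = np.asarray(knots, dtype=float)
    spacing = knots[1] - knots[0]
    # Each column is a triangle centred on one knot.
    dist = np.abs(values[:, None] - knots[None, :]) / spacing
    return np.clip(1.0 - dist, 0.0, None)

def tensor_product_basis(x, t, x_knots, t_knots):
    """Row-wise Kronecker product of the covariate and time bases."""
    bx = hat_basis(x, x_knots)              # shape (n, kx)
    bt = hat_basis(t, t_knots)              # shape (n, kt)
    n = bx.shape[0]
    # Column index is j * kt + k for covariate knot j and time knot k.
    return np.einsum("ij,ik->ijk", bx, bt).reshape(n, -1)

def tensor_penalty(kx, kt, lam_x, lam_t):
    """Sum of marginal second-difference penalties (an assumed form)."""
    dx = np.diff(np.eye(kx), n=2, axis=0)
    dt = np.diff(np.eye(kt), n=2, axis=0)
    return (lam_x * np.kron(dx.T @ dx, np.eye(kt))
            + lam_t * np.kron(np.eye(kx), dt.T @ dt))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)         # hypothetical statistic values
t = rng.uniform(1950.0, 2020.0, size=200)   # hypothetical event times
x_knots = np.linspace(0.0, 1.0, 6)
t_knots = np.linspace(1950.0, 2020.0, 8)

design = tensor_product_basis(x, t, x_knots, t_knots)
theta = rng.normal(size=design.shape[1])    # arbitrary coefficients
effect = design @ theta                     # f(x, t) at each observation
penalty = tensor_penalty(6, 8, lam_x=1.0, lam_t=1.0)
roughness = theta @ penalty @ theta         # quadratic penalty value
```

Fixing `t` and varying `x` in `tensor_product_basis` traces a non-linear effect at one point in time, while fixing `x` and varying `t` traces a time-varying effect at one covariate value. The separate weights `lam_x` and `lam_t` let the surface be smoothed differently along the two axes.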

Paper Structure

This paper contains 31 sections, 16 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Motivating example: publication events. This picture represents the first three events of a synthetic relational hyperevent sequence of paper publications. Each directed hyperevent consists of a paper, written by a team of authors (senders) and citing a set of references (receivers). Each hyperevent carries a time stamp reporting the publication date. A paper can be cited only after it has been published, which is why each published paper appears as a candidate for citation at the following time stamp. Furthermore, each actor is associated with a score (S), and each paper with a numeric impact (I).
  • Figure 2: Two hypothetical effects in the synthetic motivating example. a) The solid curve represents the estimated time‐varying coefficient $\beta(t)$ of the Average Author Score (AAS), entering the model through the linear effect $\beta(t)\, x^{\text{AAS}}$. Together, these determine the contribution to the log-hazard. During the 1950s, authors with a high average score tended to publish less frequently. Toward the late 1980s this pattern reversed, with high-scoring authors publishing more often, although this tendency subsequently weakened. b) The solid curve represents the contribution $f(x^{\text{RPI}})$ of the total Referenced Papers’ Impact (RPI) to the log-hazard. Initially, higher total RPI increases the likelihood of a publication event. However, when the total impact becomes too large, the contribution decreases, indicating a reduced likelihood of publication.
  • Figure 3: Jointly time-varying non-linear effects of covariates in the synthetic motivating example. In both heatmaps, the x-axis represents time; fixing a point on the y-axis allows us to examine how the effect evolves over time. Since the y-axis represents covariate values, fixing a point on the x-axis shows the non-linear contribution of the covariate. Lighter (yellow) areas indicate a larger contribution to the log-hazard than darker (blue) areas. Gray points are values not observed in the data. Contour lines trace regions of constant effect; closer contours indicate faster changes. a) The heatmap shows that the contribution of the Average Author Score to the log-hazard varies non-monotonically over time and across covariate values. For higher values of AAS (approximately 1.7–1.85), the effect first decreases, then increases, and finally weakens again. This reflects the behavior observed in Figure 2 a). At the beginning and end of the observation period (approximately 1950–1975 and 2000–2021), the trend is clearly non-monotonic, peaking around a value of 1.6. Between 1975 and 2000, the trend is inverted, favoring a higher average author score. b) The heatmap shows the contribution of the total Referenced Papers' Impact. Although the effect is non-linear (peaking around a value of 3), as in Figure 2 b), it remains relatively stable over time. This is visible from the horizontal contour lines: fixing a point on the y-axis reveals little temporal variation.
  • Figure 4: Observed hyperevent and candidate non-hyperevents in the synthetic motivating example. a) Observed hyperevent. This corresponds to the first hyperevent shown in Figure 1 and listed in Table \ref{tab:hyperevents}, where authors 2 and 3 publish article C citing papers A and B. To compute the likelihood in Equation \ref{eq_partial_likelihood}, a non-hyperevent must be sampled for comparison. Panels b), c), and d) illustrate possible candidates. In the empirical application, non-hyperevents are sampled to have the same cardinality as the observed hyperevent. Under this criterion, only the non-hyperevent in panel b) is a valid candidate among those shown.
  • Figure 5: Linear models are recovered by penalized non-linear models. Panels a), b), and c) display results based on synthetic data generated according to the model in Equation \ref{eq_RG1}, differing only in the number $n$ of simulated events. Panels a) and b) show estimates from linear and non-linear models, while panel c) presents results from a model incorporating a joint time-varying and non-linear effect. Top. Across experiment replications, estimates are aggregated using inverse variance weighting. For the linear model, this yields a consensus slope referred to as the "consensus linear effect" (blue dashed line). For non-linear models, predicted effects are interpolated across the covariate domain, and aggregated pointwise using inverse variance weights, resulting in the "consensus non-linear effect" (black dashed line). Since non-linear estimates are identifiable only up to an additive constant, the consensus curve is manually shifted for alignment. Confidence intervals, derived from the 5th and 95th percentiles of the empirical interpolated estimates at each covariate value, are also shifted accordingly. Both linear and non-linear estimated effects are compared to the ground truth (red solid line). Bottom. For the TVNLE model, estimates are aggregated using inverse variance weighting. The resulting smooth surface is then centered such that, at each time point, the average effect across the covariate domain is equal to $0$.
  • ...and 5 more figures
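
The aggregation steps described in the Figure 5 caption (pointwise inverse variance weighting across replications, a 5th to 95th percentile band, and centering the surface at each time point) can be sketched as follows. This is a hedged illustration only: the function names, array layouts, and synthetic numbers are assumptions introduced here and are not taken from the paper.

```python
import numpy as np

def inverse_variance_consensus(estimates, variances):
    """Pointwise inverse-variance weighted mean across replications.

    Both inputs have shape (n_replications, n_grid_points) (assumed layout).
    """
    estimates = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    return (weights * estimates).sum(axis=0) / weights.sum(axis=0)

def percentile_band(estimates, lower=5, upper=95):
    """Empirical percentile band at each grid point, shape (2, n_grid_points)."""
    return np.percentile(np.asarray(estimates, dtype=float),
                         [lower, upper], axis=0)

def center_over_covariate(surface):
    """Centre a surface of shape (n_covariate_points, n_time_points) so
    that, at each time point, the mean over the covariate grid is zero."""
    surface = np.asarray(surface, dtype=float)
    return surface - surface.mean(axis=0, keepdims=True)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 2.0, 5)              # hypothetical covariate grid
truth = 0.5 * grid                           # hypothetical ground-truth effect
variances = rng.uniform(0.05, 0.2, size=(20, grid.size))
estimates = truth + rng.normal(size=(20, grid.size)) * np.sqrt(variances)

consensus = inverse_variance_consensus(estimates, variances)
lower, upper = percentile_band(estimates)

surface = rng.normal(size=(grid.size, 7))    # placeholder TVNLE surface
centered = center_over_covariate(surface)

# Two replications, estimates 1 and 3 with variances 1 and 3:
# weights are 1 and 1/3, so the consensus is (1 + 1) / (4/3) = 1.5.
toy = inverse_variance_consensus([[1.0], [3.0]], [[1.0], [3.0]])
```

Replications with smaller estimated variance pull the consensus toward their own estimate, which is why the toy result of 1.5 sits closer to 1 than the unweighted mean of 2 does.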