Time to Cite: Modeling Citation Networks using the Dynamic Impact Single-Event Embedding Model

Nikolaos Nakis, Abdulkadir Celikkanat, Louis Boucherie, Sune Lehmann, Morten Mørup

TL;DR

The paper models citation networks as single-event, time-stamped interactions while capturing both dynamic impact and latent structure. It introduces the Single-Event Poisson Process (SE-PP) and the Dynamic Impact Single-Event Embedding Model (DISEE), which combines a latent distance model with time-varying paper masses driven by parametric impact functions $f_i(t)$. Its contributions include deriving the SE-PP, formulating DISEE with time-varying target masses $\exp(\alpha_i)f_i(t)$, source masses $\exp(\beta_j)$, and latent distances, and empirically validating that DISEE achieves competitive or superior link-prediction performance while yielding interpretable paper lifecycles. The work provides a principled statistical framework for single-event networks (SENs) and sets the stage for inductive extensions, such as GNN-based embeddings for unseen papers.

Abstract

Understanding the structure and dynamics of scientific research, i.e., the science of science (SciSci), has become an important area of research for addressing pressing questions, including how scholars interact to advance science, how disciplines are related and evolve, and how research impact can be quantified and predicted. Central to the study of SciSci has been the analysis of citation networks. Here, two prominent modeling methodologies have been employed: one assesses the citation impact dynamics of papers using parametric distributions, and the other embeds the citation networks in a latent space optimal for characterizing the static relations between papers in terms of their citations. Interestingly, citation networks are a prominent example of single-event dynamic networks, i.e., networks in which each dyad has only a single event (the point in time of citation). We propose a novel likelihood function for the characterization of such single-event networks. Using this likelihood, we propose the Dynamic Impact Single-Event Embedding model (DISEE). The DISEE model characterizes the scientific interactions in terms of a latent distance model in which random effects account for citation heterogeneity, while the time-varying impact is characterized using existing parametric representations for the assessment of dynamic impact. We highlight the proposed approach on several real citation networks, finding that DISEE reconciles static latent distance network embedding approaches with classical dynamic impact assessments.

Paper Structure (6 sections, 11 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Examples of three different types of networks based on their temporal structure. Round points represent network nodes, square points make up the corresponding colored node dyads, arrows represent directed relationships between two nodes, vertical lines represent events, and black lines are the timelines while grey bold lines show that a link (event) appeared once and cannot be observed again. Left panel: Static networks where links occur once and there is no temporal information available. Middle panel: Temporal networks where links are events in time and can be observed multiple times along the timeline. Right panel: Single-event networks (SENs) where links appear in a temporal manner but can occur only once for each dyad, defining edges as single events.
  • Figure 2: DISEE procedure overview. The model defines for the SE-PP an intensity function introducing two sets of static embeddings distinguishing between source $\mathbf{w}_u$ and target $\mathbf{z}_v$ node embeddings. Furthermore, each node is assigned its own random effect, distinguishing again the source $\beta_u$ and target $\alpha_v$ roles. The random effects can be parameterized to represent source and target masses through the exponential function. Finally, for each target node of the network, the model defines an impact function $f_v(t)$ yielding a temporal impact characterization of the nodes' link dynamics, which controls the nodes' time-varying mass as $\exp{(\alpha_v)}f_v(t)$.
  • Figure 3: Artificial: Comparison of the inferred impact functions, generated by a DISEE Mixture Model with three Truncated normal distributions, to the true citation histogram of two papers from the Art dataset.
  • Figure 4: Machine Learning: Inferred impact functions of the DISEE and FI-DISEE models (Truncated normal and Log-Normal) compared to the true citation histogram for four highly cited ML papers.
  • Figure 5: Machine Learning: Yearly evolution of the 2-dimensional DISEE embedding space with the Log-Normal impact function. Node sizes are based on each paper's mass, $f_i(t)\exp{(\alpha_i)}$. Nodes are color-coded based on their publication year.
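
The ingredients named in the Figure 2 caption can be combined in a small Python sketch. Only the time-varying target mass $\exp(\alpha_v)f_v(t)$ and the source mass $\exp(\beta_u)$ are taken from the captions. The multiplicative form of the intensity, the $\exp(-\text{distance})$ term, the Log-Normal parameters `mu` and `sigma`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math


def lognormal_pdf(t, mu, sigma):
    """Log-Normal density, used here as a hypothetical impact function f_v(t)."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def target_mass(alpha_v, t, mu, sigma):
    """Time-varying target mass exp(alpha_v) * f_v(t), as in the Figure 2 caption."""
    return math.exp(alpha_v) * lognormal_pdf(t, mu, sigma)


def intensity(w_u, z_v, beta_u, alpha_v, t, mu, sigma):
    """ASSUMED intensity: source mass * target mass * exp(-Euclidean distance).

    w_u and z_v are the static source and target embeddings; t is the time
    elapsed since the target paper v was published.
    """
    distance = math.dist(w_u, z_v)
    return math.exp(beta_u) * target_mass(alpha_v, t, mu, sigma) * math.exp(-distance)


if __name__ == "__main__":
    # A source paper close to the target in latent space versus one far away.
    near = intensity([0.0, 0.0], [0.1, 0.0], beta_u=0.0, alpha_v=1.0, t=2.0, mu=0.5, sigma=1.0)
    far = intensity([3.0, 0.0], [0.1, 0.0], beta_u=0.0, alpha_v=1.0, t=2.0, mu=0.5, sigma=1.0)
    print(near > far)
```

Under these assumptions, a source paper that lies closer to the target in the latent space receives a higher citation intensity at any given time, while the impact function alone decides when the target paper attracts citations.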