Collaboration and topic switches in science

Sara Venturini; Satyaki Sikdar; Francesco Rinaldi; Francesco Tudisco; Santo Fortunato

TL;DR

This paper studies how collaboration patterns shape topic switches in science, using a two-window design with an interaction window ($IW$) and an activation window ($AW$) on the OpenAlex dataset across 20 topics. The first experiment shows that the probability that an inactive scholar adopts a new topic, $C(k)$, increases with $k$, the number of contacts with active coauthors, and that the contributions of individual coauthors are not independent: the observed curve does not fit a baseline that assumes independent per-contact effects. The second experiment shows that active authors with higher productivity or impact exert stronger influence on their exclusive inactive coauthors, increasing their topic-switch probability and exhibiting a chaperoning tendency that scales with author prominence. Together, these results highlight how both selection (homophily) and social influence in collaboration networks steer future research directions, with implications for forecasting science trajectories.
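As a rough illustration of the quantity $C(k)$, the sketch below computes a cumulative target activation curve on a toy collaboration graph, following the reading given in the Figure 2 caption: for each $k$, the fraction of inactive authors with at least $k$ active contacts in the IW who became active in the AW. The function names, the edge-list representation, the edge weights, and the choice of which authors activate are all hypothetical and chosen for illustration; this is not the authors' code.

```python
from collections import defaultdict


def count_active_contacts(edges, active):
    """Sum, for each inactive author, the weights of edges to active authors.

    edges  : iterable of (author_a, author_b, weight) tuples, where weight is
             the number of papers the pair coauthored in the IW (assumed format)
    active : set of authors active in the focal topic by the end of the IW
    """
    contacts = defaultdict(int)
    for a, b, weight in edges:
        if a in active and b not in active:
            contacts[b] += weight
        elif b in active and a not in active:
            contacts[a] += weight
    return dict(contacts)


def cumulative_target_activation(contacts, activated_in_aw, k_max):
    """Fraction of inactive authors with >= k active contacts who activate."""
    curve = {}
    for k in range(1, k_max + 1):
        pool = [author for author, c in contacts.items() if c >= k]
        if pool:  # skip k values with no eligible inactive authors
            curve[k] = sum(author in activated_in_aw for author in pool) / len(pool)
    return curve


# Toy data (hypothetical weights; not taken from the paper).
edges = [
    ("a0", "a6", 2),
    ("a1", "a6", 1),
    ("a5", "a6", 1),
    ("a0", "a2", 1),
    ("a0", "a3", 1),
    ("a0", "a1", 1),  # both endpoints active, so it adds no contacts
]
active = {"a0", "a1", "a5"}
activated_in_aw = {"a2", "a6"}  # assumption made for this example only

contacts = count_active_contacts(edges, active)
curve = cumulative_target_activation(contacts, activated_in_aw, k_max=4)
print(contacts)
print(curve)
```

In this toy setup an author with several papers coauthored with the same active source accumulates several contacts, which matches the caption's distinction between contacts and sources.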

Abstract

Collaboration is a key driver of science and innovation. Mainly motivated by the need to leverage different capacities and expertise to solve a scientific problem, collaboration is also an excellent source of information about the future behavior of scholars. In particular, it allows us to infer the likelihood that scientists choose future research directions via the intertwined mechanisms of selection and social influence. Here we thoroughly investigate the interplay between collaboration and topic switches. We find that the probability for a scholar to start working on a new topic increases with the number of previous collaborators, with a pattern showing that the effects of individual collaborators are not independent. The higher the productivity and the impact of authors, the more likely their coworkers will start working on new topics. The average number of coauthors per paper is also inversely related to the topic switch probability, suggesting a dilution of this effect as the number of collaborators increases.

Paper Structure (10 sections, 6 equations, 9 figures, 2 tables)

Experiment I
Experiment II
Data
Overlap coefficient
Author ranking metrics
Statistical test for difference of samples
Target activation probability
Simple baseline for membership closure
Source activation probability
Chaperoning propensity

Figures (9)

Figure 1: Schematic setup for our analysis. (A) Stream of papers across interaction (IW) and activation (AW) windows. Papers tagged with the focal topic $t$ are marked in red. (B) Author collaboration graph at the end of IW. Authors $a_i$ and $a_j$ are linked by an edge of weight $k$ if $a_i$ coauthored $k$ papers with $a_j$ within the IW. The authors active in the focal topic by the end of IW are marked in red. (C) Focus: inactive authors. Inactive author $a_6$ has four active contacts from three sources $\{a_0, a_1, a_5\}$ derived from the collaboration graph in (B). (D) Focus: active authors. Active author $a_0$ has four coauthors $\{a_1, a_2, a_3, a_6\}$, of whom $a_1$ is already active, and $a_6$ also collaborated with $a_1$ in the IW. This leaves the subset of exclusive inactive coauthors $\{a_2, a_3\}$. Within this subset, only $a_2$ becomes active in the AW, resulting in $a_0$'s source activation probability of $\tfrac{1}{2}=0.50$. Additionally, $a_2$ writes their first paper with $a_0$ in the AW.
Figure 2: Experiment I. Cumulative target activation probability (in purple) for inactive authors in the AW with shaded 95% confidence intervals. For each $k$, the $y$-value indicates the fraction of inactive authors with at least $k$ active contacts in the IW who became active in the AW. The green solid line with shaded errors represents the baseline described in the text, corresponding to independent effects from the coauthors. The heatmap below the $x$-axis shows the mean difference between the observed and baseline curves for each $k$ value. A cell is gray if the 95% confidence interval contains 0, marking the $k$-values where the observed and baseline points are statistically indistinguishable at the 0.05 significance level. Positive and negative deviations from the baseline are in red and blue, respectively.
Figure 3: Heatmaps showing the mean difference between the cumulative target activation probabilities of the inactive authors in the AW who had exclusive contacts with the top 10% and bottom 10% of active authors, respectively, selected according to productivity (left) and impact (right) in the IW. The cells are gray if the 95% confidence interval contains 0. The majority of red cells indicate that the cumulative target activation probabilities for contacts with the top 10% are higher than those with the bottom 10%.
Figure 4: Experiment II results for $f^\ast = 0.10$. (A) The mean and 95% confidence interval of the mean of the difference between the cumulative source activations of active authors in the top 10% and bottom 10% based on productivity (green) and impact (pink). (B) The mean and 95% confidence interval of the mean of the difference between the chaperoning propensities of active authors in the top 10% and bottom 10% based on productivity (green) and impact (pink). A positive difference indicates that the effect is stronger for the top 10% active authors.
Figure 5: Dilution effect results for $f^\ast = 0.10$. The mean and 95% confidence interval of the mean of the difference between the cumulative source activations of active authors in the top 20% and bottom 20% bins, based on the average number of coauthors, among the top 10% active authors in productivity (green) and impact (pink). A negative difference across the topics indicates a dilution effect, wherein coauthors of prominent active scholars with fewer collaborators are more likely to switch topics.
...and 4 more figures
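
The source activation probability in the Figure 1 (D) caption can be reproduced with a short sketch. It uses only the collaborations the caption mentions (the full graph in the figure may contain more authors and edges), and it assumes that a coauthor counts as "exclusive" to a source when they are inactive and have no other active coauthor in the IW. The function names and the adjacency-dictionary representation are hypothetical, not the authors' implementation.

```python
def exclusive_inactive_coauthors(source, coauthors, active):
    """Inactive coauthors of `source` with no other active coauthor in the IW.

    coauthors : dict mapping each author to the set of their IW coauthors
    active    : set of authors active in the focal topic by the end of the IW
    """
    exclusive = set()
    for coauthor in coauthors[source]:
        if coauthor in active:
            continue  # already active, so not a candidate for switching
        other_active = (coauthors[coauthor] & active) - {source}
        if not other_active:
            exclusive.add(coauthor)
    return exclusive


def source_activation_probability(source, coauthors, active, activated_in_aw):
    """Share of the source's exclusive inactive coauthors who activate in the AW."""
    exclusive = exclusive_inactive_coauthors(source, coauthors, active)
    if not exclusive:
        return None  # undefined when there are no exclusive inactive coauthors
    return len(exclusive & activated_in_aw) / len(exclusive)


# Collaborations named in the Figure 1 caption (other edges omitted).
coauthors = {
    "a0": {"a1", "a2", "a3", "a6"},
    "a1": {"a0", "a6"},
    "a2": {"a0"},
    "a3": {"a0"},
    "a5": {"a6"},
    "a6": {"a0", "a1", "a5"},
}
active = {"a0", "a1", "a5"}
activated_in_aw = {"a2"}  # per the caption, only a2 becomes active

exclusive = exclusive_inactive_coauthors("a0", coauthors, active)
probability = source_activation_probability("a0", coauthors, active, activated_in_aw)
print(sorted(exclusive), probability)
```

Here $a_1$ is dropped because it is already active and $a_6$ is dropped because it also collaborated with other active authors, leaving $\{a_2, a_3\}$ and a probability of $1/2$, as in the caption.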
