Persistent geographical biases in global scientific collaboration and citations

Leyan Wu, Yong Huang, Wei Lu, Akrati Saxena, Vincent Traag

Abstract

Scientific knowledge flows enable cumulative progress by connecting researchers across disciplines, institutions, and countries. Yet it remains unclear how geography and national structures continue to shape these exchanges in an increasingly connected world. Using a large-scale bibliometric dataset from OpenAlex, which covers 39.35 million publications across 95 countries and 3,794 cities between 2000 and 2022, we examine global knowledge diffusion through two complementary channels: co-authorship and citation. We find that the constraining effect of geographic distance on collaboration has not diminished over time but has instead intensified, suggesting persistent structural or institutional barriers. Citation flows, by contrast, are less sensitive to spatial proximity, indicating that intellectual influence may diffuse more freely across borders. At the country level, research networks exhibit strong domestic preferences and a shared citation orientation toward the United States. China, while increasingly favored as a collaboration partner by other countries, continues to be systematically undercited within global citation flows. International mobility increases researchers' collaboration with scholars in their host country but has limited effects on citation flows. These results highlight the structural persistence of spatial and country biases in global science, with implications for equitable participation and recognition across regions.

Paper Structure

This paper contains 15 sections, 9 equations, 19 figures, 3 tables.

Figures (19)

  • Figure 1: A simple directed acyclic graph (DAG) illustrating the core causal assumptions used to identify potential country-level biases. In particular, we are interested in the direct causal effect of the country on collaborations. Note that the influence of scientific fields is not shown in this DAG; it is assumed to act simply as a confounder between the city and the collaboration. Although this DAG only illustrates the assumptions for collaborations, a DAG for citations would be highly similar, except that time introduces an additional complication.
  • Figure 2: Overall trends in collaboration and citation metrics. a, Probability of the existence of a link as a function of the geographic distance between two cities in the collaboration network. b, Heatmap shows the distribution of the ratio of the link weight and the product of the strengths of its endpoints $w_{ij}^{\mathrm{Col}} / (s_i s_j)$ in the collaboration network against the geographic distance $d_{ij}$ between cities; scatter points and lines represent the average ratio per distance bin. c, The trends shown in a, separated by collaboration type into domestic and international. d, The trends shown in b, separated by collaboration type into domestic and international. e, Probability of the existence of a link as a function of the geographic distance between two cities in the citation network. f, Heatmap shows the distribution of the ratio of the link weight and the product of the strengths of its endpoints $w_{ij}^{\mathrm{Cite}} / (s_i s_j)$ in the citation network against the geographic distance $d_{ij}$ between cities; scatter points and lines represent the average ratio per distance bin. g, The trends shown in e, separated by collaboration type into domestic and international. h, The trends shown in f, separated by collaboration type into domestic and international.
  • Figure 3: Impact of geographic distance on international collaboration and citation across disciplines. a, Temporal trend of the effect of geographic distance on international collaboration, as estimated by our statistical model, controlling for publication volume and country-level preferences. The black line represents the average trend across all disciplines, while colored lines denote individual fields. b, Analogous results for international citations. The shading around the trends denotes the confidence intervals of the estimated coefficients.
  • Figure 4: International collaboration and citation preferences between countries. a, Overall collaboration preference among six selected countries from 2000 to 2022. b, Overall citation preference among the same countries and years. c, Annual trends of collaboration preference of all countries toward China and the United States. d, Annual trends of citation preference of all countries toward China and the United States. Both heatmaps display bilateral preferences across all disciplines, where lighter pink indicates a weaker preference and deeper blue indicates a stronger collaboration or citation preference. The shaded areas around the lines represent the 95% confidence intervals of the yearly means.
  • Figure 5: Collaboration and citation preference (US-CN). a, The evolution of collaboration preference between the US and China over time. Collaboration preference is undirected. b, The change in citation preference between the two countries over time. Citation preference is directed: "CN-US" refers to Chinese papers citing US papers, while "US-CN" refers to US papers citing Chinese papers. Shaded areas in panels a and b represent the 95% confidence intervals of yearly means. c, Field-specific total collaboration preference, aggregated from panel a. d, Field-specific total citation preference, aggregated from panel b.
  • ...and 14 more figures
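
The Figure 2 caption plots the normalized link weight $w_{ij} / (s_i s_j)$ against the geographic distance $d_{ij}$ and averages it per distance bin. The following is a minimal sketch of that computation for an undirected city network, not the authors' code. It assumes that the strength $s_i$ is the sum of the weights of all links incident to city $i$, that distance is the great-circle (haversine) distance, and that the bin edges are chosen by the user; the function names and the toy data are hypothetical.

```python
import bisect
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance measure, Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, a)))


def normalized_weights(weights):
    """Map each undirected link (i, j) with weight w to w / (s_i * s_j).

    Assumption: s_i is the total weight of all links incident to city i.
    """
    strength = defaultdict(float)
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
    return {(i, j): w / (strength[i] * strength[j]) for (i, j), w in weights.items()}


def mean_ratio_per_bin(ratios, coords, bin_edges_km):
    """Average the normalized weight within each distance bin.

    Bin b covers [bin_edges_km[b], bin_edges_km[b + 1]); links outside all
    bins are ignored. Returns {bin index: mean ratio} for non-empty bins.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (i, j), ratio in ratios.items():
        d = haversine_km(*coords[i], *coords[j])
        b = bisect.bisect_right(bin_edges_km, d) - 1
        if 0 <= b < len(bin_edges_km) - 1:
            sums[b] += ratio
            counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


# Toy example with three hypothetical cities on the equator (lat, lon).
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 10.0)}
weights = {("A", "B"): 4.0, ("A", "C"): 1.0, ("B", "C"): 1.0}

ratios = normalized_weights(weights)
binned = mean_ratio_per_bin(ratios, coords, bin_edges_km=[0.0, 500.0, 2000.0])
```

Dividing by $s_i s_j$ removes the effect of city size, so the binned averages describe how link intensity varies with distance rather than how large the endpoint cities are. For the directed citation network, in- and out-strengths would need to be treated separately; this sketch does not cover that case.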