Modeling Changing Scientific Concepts with Complex Networks: A Case Study on the Chemical Revolution

Sofía Aguilar-Valdez, Stefania Degaetano-Ortlieb

Abstract

While context embeddings produced by LLMs can be used to estimate conceptual change, these representations are often neither interpretable nor time-aware. Moreover, bias augmentation in historical data poses a non-trivial risk to researchers in the Digital Humanities. Hence, to model reliable concept trajectories in evolving scholarship, we develop a framework that represents prototypical concepts through topic-based complex networks. Using the Royal Society Corpus, we analyze two competing theories from the Chemical Revolution (phlogiston vs. oxygen) as a case study and show that onomasiological change is linked to higher entropy and topological density, indicating increased diversity of ideas and connectivity effort.

Paper Structure (20 sections, 1 equation, 6 figures)

Figures (6)

  • Figure 1: Diachronic prototypical concepts. This schematic illustrates movements in central and peripheral readings (arrows) and the presence of notable terms (ovals). Over time, air moved from the core to the periphery while acid did the opposite; this was accompanied by the removal of calx and phlogiston and the emergence of oxygen.
  • Figure 1: Temporal graphs. The colors indicate communities, whose number declined from 5 (1750s) to 3 (1800s).
  • Figure 2: Topic models evaluation. These results, produced by evaluating models for the 1800s non-cumulative corpus, were consistent across decades and strategies. Since both metrics degrade beyond 6 topics (↑perplexity, ↓coherence), the optimal number is 6.
  • Figure 2: Network metrics. We used five parameters to interpret network stability over time: node size, edge density (y-axis in units of $1\times10^{3}$; e.g., the last reported number of edges is >60 000), community count, modularity, and percolation threshold.
  • Figure 3: Topic clusters. Comparing the two relevant decades (1st row=1780s, 2nd row=1800s) across sampling strategies (1st column=cumulative, 2nd column=non-cumulative), we observe that cumulative sampling yields topic clusters with stable labels over time, whereas non-cumulative sampling provides more fine-grained representations.
  • ...and 1 more figure
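
The abstract and the network-metrics caption refer to entropy, edge density, and modularity without showing how such quantities are computed. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses the standard textbook definitions (Shannon entropy in bits, undirected simple-graph density, Newman modularity for an unweighted graph), and every name and value in it (`shannon_entropy`, `edge_density`, `modularity`, `toy_edges`, `toy_communities`) is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(weights):
    """Shannon entropy in bits of non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def edge_density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    if n_nodes < 2:
        return 0.0
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra = Counter()       # edges with both endpoints in the same community
    degree_sum = Counter()  # summed node degrees per community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra[community_of[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph (hypothetical, not from the corpus): two triangles joined by one bridge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
toy_communities = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

if __name__ == "__main__":
    print("entropy of a uniform 4-topic mix:", shannon_entropy([1, 1, 1, 1]))
    print("edge density:", edge_density(6, toy_edges))
    print("modularity:", modularity(toy_edges, toy_communities))
```

In this toy setting, a more uniform topic mix gives higher entropy, and adding edges between the two triangles raises density while lowering modularity. How the paper itself weights edges or detects communities is not stated in this excerpt.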