Robust Evidence for Declining Disruptiveness: Assessing the Role of Zero-Backward-Citation Works

Michael Park, Erin Leahey, Russell J. Funk

TL;DR

Park et al. defend the robustness of a declining disruptiveness signal against Holst et al.'s critique that zero-backward-citation works drive the trend. They replicate analyses using Holst et al.'s dataset and disruption metric, apply their regression controls, and still observe substantial declines in $CD_5$ for both papers and patents, comparable to major transformations in science. Through Monte Carlo simulations, the $CD_5^{noK}$ variant, and alternative disruption measures, they show the decline is not an artifact of 0-bcite inclusion or dataset quality, though they also highlight critical data-quality issues in SciSciNet that may inflate 0-bcite counts. The authors further argue that wholesale exclusion of 0-bcite works risks omitting radical innovations, and they provide theoretical and empirical evidence that the observed temporal decline reflects substantive shifts in scientific and technological innovation patterns with practical implications for scientometric methods.

Abstract

We respond to Holst et al.'s (HATWG) critique that the observed decline in scientific disruptiveness demonstrated in Park et al. (PLF) stems from including works with zero backward citations (0-bcites). Applying their own advocated dataset, metric, and exclusion criteria, we demonstrate statistically and practically significant declines in disruptiveness that equal major benchmark transformations in science. Notably, we show that HATWG's own regression model -- designed specifically to address their concerns about 0-bcite works -- reveals highly significant declines for both papers (p<0.001) and patents (p<0.001), a finding they neither acknowledge nor interpret. Their critique is undermined by methodological deficiencies, including reliance on visual inspection without statistical assessment, and severe data quality issues in their SciSciNet dataset, which contains nearly three times more 0-bcite papers than our original data. HATWG's departure from established scientometric practices -- notably their inclusion of document types and fields known for poor metadata quality -- invalidates their conclusions. Monte Carlo simulations and additional analyses using multiple disruptiveness measures across datasets further validate the robustness of the declining trend. Our findings collectively demonstrate that the observed decline in disruptiveness is not an artifact of 0-bcite works but represents a substantive change in scientific and technological innovation patterns.

Paper Structure

This paper contains 19 sections, 7 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: Declining Disruptiveness Matches Major Benchmark Transformations in Science After Excluding Zero-Backward-Citation Works. The left panel plots the average percentile values of disruptiveness by field over time, calculated using the precomputed disruption scores in SciSciNet (lin2023sciscinet), which exclude 0-bcite papers and are consistent with HATWG's advocated methodology. The right panel plots the percentile values of major benchmark transformations in science, including mean reference age (evans2008electronic, lariviere2008long, chu2021slowed, verstak2014shoulders, sinatra2015century, wang2021science), proportion of team members with a career age $>$20 years (cui2022aging, wang2021science, jones2011age, jones2010age, blau2017us, alberts2015addressing), proportion of female team members (nsf2021, huang2020historical), countries per team (wang2021science, wagner2017growth, lin2023remote, adams2013fourth, ribeiro2018growth, leydesdorff2008international, luukkonen1992understanding), and mean team size (wuchty2007increasing, jones2021rise, wang2021science, milojevic2014principles). Disruptiveness (in black) is plotted alongside these benchmarks for comparison. Even after excluding papers that make 0 backward citations, the magnitude of the decline in disruptiveness is comparable to these well-documented trends, underscoring its robust practical significance. Corresponding regression analyses in Table \ref{table:RegressionBenchmarkMeasures} verify that all trends, including disruptiveness overall and within fields, as well as all the benchmarks, are statistically significant at the p$<$0.001 level. Shaded bands correspond to 95% confidence intervals.
  • Figure 2: Persistent Declines in Disruptiveness Using HATWG's Proposed Regression Model. This figure plots the predicted values of the CD$_5$ index (disruptiveness) for papers in Web of Science (left panel) and patents in Patents View (right panel), obtained from the coefficient estimates in Table \ref{table:RegressionAdjustment}. Our regressions are based on HATWG's regression model (cf. HATWG, Table S1), which includes their proposed dummy variable for 0-bcite works, in addition to the full suite of control variables used in PLF (see PLF Extended Data Figure 8). From 1945 to 2010, the predicted disruptiveness for papers declines by $\beta$=-0.082 (p$<$0.001). The decline is substantial: it is comparable to the difference between an average paper (CD$_5$=0.040) and a Nobel-winning paper (CD$_5$=0.131) (li2019dataset), or to a median-ranked paper rising to the 93rd percentile. For patents, the 1980-2010 decline is even more pronounced ($\beta$=-0.155, p$<$0.001), and is comparable to the gap between an average patent (CD$_5$=0.123) and the average of the 37 landmark (1980 onwards) patents identified by Kelly et al. (kelly2021measuring) (CD$_5$=0.270), or to a median-ranked patent rising to the 84th percentile. Thus, even after applying HATWG's adjustments, the decline in disruptiveness over time remains both statistically significant and practically meaningful. Shaded bands correspond to 95% confidence intervals.
  • Figure 3: Persistent Decline in Disruptiveness Relative to Randomly Rewired Citation Networks. This figure compares the temporal trends in average disruptiveness for papers (Web of Science, left) and patents (Patents View, right), shown alongside average disruptiveness in comparable randomly rewired citation networks. We measure disruptiveness using CD$_5^{noK}$, an independently validated disruption indicator that excludes the original CD index's $n_K$ term (bornmann2020disruption, bornmann2020disruptive, leibel2024we, leydesdorff2020proposal, deng2023enhancing). The $n_K$ term is preserved in the rewiring process of HATWG's simulations (see Sec. \ref{sec:MathematicalProperties} for a formal mathematical demonstration). Because $n_K$ is preserved in the rewired networks, trends in disruptiveness that are attributable to $n_K$ will be present in both the observed and rewired networks, which makes direct comparisons of the average disruptiveness in the observed and rewired networks using the original CD index (HATWG's approach) inappropriate. The plots reveal that while both papers and patents show persistent declines in CD$_5^{noK}$ in the observed network, the rewired network maintains a flat trend. (Note that the flat trend results from nearly all "J"-type cites [future works citing both the focal work and its references] switching to "I"-type cites [future works citing the focal work but none of its references], leading to an average CD$_5^{noK}$ value of approximately 1; see Section S5.3 for additional mathematical details.) This provides robust evidence that the observed declines in disruptiveness are substantive rather than artifacts of the changing prevalence of 0-bcite papers/patents or other similar factors.
For analyses using the original CD index, the appropriate method is to 'net out' the level of disruptiveness attributable to structural properties of the citation network by comparing the observed disruptiveness to the disruptiveness in randomly rewired networks at the level of individual papers (or patents). PLF implements this adjustment using $z$-scores (see PLF Extended Data Figure 8). An alternative approach is to estimate a regression model that predicts the observed CD index as a function of time (year dummies) while controlling for the CD index value in the randomly rewired networks for each paper or patent. The results of this approach, shown in Table \ref{table:RandomRegressionAdjustment} and the corresponding predicted values plot (Fig. \ref{fig:RandomNetworkRegressionPlot}), provide further support that the observed declines in disruptiveness are not attributable to changes in citation network structure (e.g., the prevalence of 0-bcite papers or patents). Shaded bands correspond to 95% confidence intervals.
  • Figure 4: Departures from Scientometric Best Practices Lead to Severe Overrepresentation of Zero-Reference Works in HATWG's Data. This figure reveals severe data quality concerns in HATWG's SciSciNet dataset, particularly their handling of 0-bcite documents, which they identify as problematic. Their critique, however, applies more to their own dataset than to PLF's. Row 1 (a,b): SciSciNet's 0-bcite proportion is approximately three times larger than PLF's for both Web of Science (2.76 times higher) and Patents View (2.98 times higher). HATWG's effort to address this through 'Journal' and 'Conference' paper subsetting (their Figure S8) is ineffective, with 0-bcite proportions remaining nearly identical, revealing SciSciNet's insufficiently granular document type classification. Row 2 (c,d): Using a SciSciNet-to-WoS Research Areas crosswalk (Table \ref{table:FieldsMapping}), SciSciNet shows higher 0-bcite proportions across all fields. This disparity reaches its peak in Humanities, approaching 70%. While PLF excluded this field following scientometric standards, Humanities publications were included in HATWG, which inflated their count of 0-bcite documents. Row 3 (e,f): The analysis by document type reveals another crucial issue. While PLF adhered to scientometric best practices by including only research articles (yielding fewer 0-bcite documents), HATWG's SciSciNet data incorporated a wide range of document types (e.g., news items, corrections, commentaries, book reviews) that typically lack citations, thereby inflating the proportion of 0-bcite documents in their data. As an example, the left inset of panel f shows that most documents published in Nature and coded as 'Journal' in SciSciNet are editorial, commentary, or other non-research pieces, yet they were included in HATWG's analysis.
The proportion of 0-bcite documents among these unconventional document types vastly exceeds that of research articles (right inset panel), highlighting the critical importance of proper document type selection in the context of HATWG's critique. Granular document type classifications for SciSciNet documents were determined through DOI-based matching with Dimensions.ai (36,530,788 of 45,251,912 papers, or 80.73% of HATWG's SciSciNet papers, were successfully linked), with crosswalk details in Table \ref{table:MetaCategories}.
  • Figure S1: Persistent Decline Across Independently Developed Disruptiveness Measures. This figure demonstrates that our finding of a persistent decline in disruptiveness even when 0-bcite papers are excluded is not sensitive to the choice of disruption metric or data set. Specifically, the plots track the average (percentile) values of four independently developed measures of disruptiveness, plotted separately for Web of Science (left panel) and SciSciNet (right panel) across years. The measures include CYG$_5$ (Citation Year Gap), which calculates the average age gap between references cited by citing works relative to the focal paper (brauer2023aggregate, brauer2024searching); Is D$_5$, a binary variable for whether the CD$_5$ value (disruptiveness index) is positive (lin2023remote, li2024displacing, yang2024female); CD$_5^{noK}$, which excludes references-only citations (sometimes referred to as the "$n_K$" term) from the denominator (bornmann2020disruption, bornmann2020disruptive, leibel2024we, leydesdorff2020proposal, deng2023enhancing); and CD$_5^5$, which introduces a threshold requiring future works to cite multiple references of the focal paper, emphasizing stronger bibliographic coupling (li2024displacing, wu2019solo, jiang2024new, bornmann2020disruption, leibel2024we). All measures exclude 0-bcite documents and are percentile-normalized to enable comparison across scales. The figure shows a consistent decline in disruptiveness over time across all measures and datasets, in line with the findings of PLF. The regression results in Table \ref{table:RegressionAlternativeMeasures} show that the declines are statistically significant at the p$<$0.001 level. Shaded bands correspond to 95% confidence intervals.
  • ...and 3 more figures
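
The captions for Figure 3 and Figure S1 turn on the difference between the original CD index and the CD$_5^{noK}$ variant, so a small worked example may help. The sketch below assumes the commonly used formulation CD = (n_I - n_J) / (n_I + n_J + n_K), with CD^{noK} dropping n_K from the denominator. It is an illustration only, not the authors' code. The function name `cd_indices`, the `CDResult` container, and the toy citation data are hypothetical. The caller is assumed to have already restricted `future_cites` to works published within five years of the focal work.

```python
from typing import Dict, NamedTuple, Optional, Set


class CDResult(NamedTuple):
    n_i: int  # future works citing the focal work but none of its references
    n_j: int  # future works citing both the focal work and its references
    n_k: int  # future works citing only the focal work's references
    cd: Optional[float]      # original CD index (None if undefined)
    cd_nok: Optional[float]  # CD^{noK}: n_K dropped from the denominator


def cd_indices(
    focal: str,
    focal_refs: Set[str],
    future_cites: Dict[str, Set[str]],
) -> CDResult:
    """Count I-, J-, and K-type citing works and derive both indices.

    `future_cites` maps each later work to the set of works it cites.
    Assumption: it already contains only works inside the 5-year window.
    """
    n_i = n_j = n_k = 0
    for cited in future_cites.values():
        cites_focal = focal in cited
        cites_refs = bool(cited & focal_refs)
        if cites_focal and cites_refs:
            n_j += 1
        elif cites_focal:
            n_i += 1
        elif cites_refs:
            n_k += 1
        # Works citing neither the focal work nor its references are ignored.

    with_k = n_i + n_j + n_k
    without_k = n_i + n_j
    cd = (n_i - n_j) / with_k if with_k else None
    cd_nok = (n_i - n_j) / without_k if without_k else None
    return CDResult(n_i, n_j, n_k, cd, cd_nok)


# Toy data (hypothetical): focal work "F" cites "R1" and "R2".
toy_cites = {
    "A": {"F"},          # I-type
    "B": {"F", "R1"},    # J-type
    "C": {"R2"},         # K-type
    "D": {"F"},          # I-type
    "E": {"R1", "R2"},   # K-type
    "G": {"X"},          # unrelated, ignored
}
result = cd_indices("F", {"R1", "R2"}, toy_cites)  # n_i=2, n_j=1, n_k=2

# A 0-bcite focal work has no references, so n_J and n_K are both zero and
# any citation it receives is I-type.
zero_bcite = cd_indices("F", set(), {"A": {"F"}, "B": {"F", "R1"}})
```

Under this formulation, a 0-bcite work that receives any citation gets CD = 1 by construction, which is why the share of such works matters to the debate summarized above. It also shows why a term like n_K that the rewiring preserves can carry a trend into both the observed and the rewired networks.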