Understanding Main Path Analysis

H. C. W. Price; T. S. Evans

TL;DR

This work addresses the lack of theoretical grounding in Main Path Analysis by establishing an information-theoretic and geometric basis for its edge weights and path selection. It introduces a basket-based framework that uses generalised criticality to capture the diverse core nodes lying on near-optimal paths, and it demonstrates the robustness and scalability of this framework on artificial DAGs and real-world networks. The study shows that traditional SPC/SPE single-path methods offer little advantage over simpler unit-weight approaches, and that baskets effectively summarise the backbone of knowledge flows. Overall, the paper provides a practical, interpretable methodology for identifying key knowledge structures in large DAGs, with implications for bibliometrics and network science.

Abstract

Main path analysis has long been used to trace knowledge trajectories in citation networks, yet it lacks solid theoretical foundations. To understand when and why this approach succeeds, we analyse directed acyclic graphs generated by two types of artificial models, together with over twenty networks derived from real data. We show that entropy-based variants of main path analysis optimise geometric distance measures, providing the first information-theoretic and geometric basis for the method. Numerical results demonstrate that existing algorithms converge on near-geodesic solutions. We also show that an approach based on longest paths produces similar results and is equally well motivated, yet is much simpler to implement. However, the traditional single-path focus is unnecessarily restrictive, as many near-optimal paths highlight different key nodes. We introduce an approach using "baskets" of nodes, in which we select the fraction of nodes with the smallest values of a measure we call "generalised criticality". Analysis of large vaccine citation networks shows that these baskets achieve comprehensive algorithmic coverage, offering a robust, simple, and computationally efficient way to identify core knowledge structures. In practice, we find that the nodes with zero unit criticality capture the information in main paths in almost all cases, and capture a wider range of key nodes without unnecessarily increasing the number of nodes considered. We find no advantage in using the traditional main path methods.
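The basket idea in the abstract can be illustrated with a minimal sketch. The abstract does not define generalised criticality, so the code below assumes one plausible reading of "unit criticality": give every edge weight 1 and, for each node, take the shortfall between the longest path in the whole DAG and the longest path passing through that node. Under this assumption, nodes with zero unit criticality are exactly those lying on at least one longest path. The function names `unit_criticality` and `basket` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque


def topological_order(nodes, edges):
    """Kahn's algorithm; raises ValueError if the graph contains a cycle."""
    succ = defaultdict(list)
    indeg = {v: 0 for v in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in nodes if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph is not acyclic")
    return order, succ


def unit_criticality(nodes, edges):
    """ASSUMED definition: (longest path in the DAG) minus
    (longest path through the node), counting each edge as 1."""
    order, succ = topological_order(nodes, edges)
    into = {v: 0 for v in nodes}  # longest path ending at v
    for u in order:
        for v in succ[u]:
            into[v] = max(into[v], into[u] + 1)
    out = {v: 0 for v in nodes}   # longest path starting at v
    for u in reversed(order):
        for v in succ[u]:
            out[u] = max(out[u], out[v] + 1)
    longest = max(into[v] + out[v] for v in nodes)
    return {v: longest - (into[v] + out[v]) for v in nodes}


def basket(criticality, fraction=None):
    """Zero-criticality nodes by default; otherwise the given fraction of
    nodes with the smallest criticality (ties broken by node label)."""
    if fraction is None:
        return {v for v, c in criticality.items() if c == 0}
    k = max(1, round(fraction * len(criticality)))
    ranked = sorted(criticality, key=lambda v: (criticality[v], str(v)))
    return set(ranked[:k])


# Toy DAG with two tied longest paths (a-b-d-e and a-c-d-e) and a shortcut via f.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "d"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("a", "f"), ("f", "e")]
crit = unit_criticality(nodes, edges)
core = basket(crit)
```

In this toy graph a single-path method would have to choose between `b` and `c`, whereas the zero-criticality basket keeps both and still excludes `f`, which lies only on a shorter route.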
