Locating the Leading Edge of Cultural Change

Sarah Griebel, Becca Cohen, Lucian Li, Jaihyun Park, Jiayu Liu, Jana Perkins, Ted Underwood

Abstract

Measures of textual similarity and divergence are increasingly used to study cultural change. But which measures align, in practice, with social evidence about change? We apply three different representations of text (topic models, document embeddings, and word-level perplexity) to three different corpora (literary studies, economics, and fiction). In every case, works by highly cited authors and younger authors are textually ahead of the curve. We don't find clear evidence that any one representation of text is to be preferred over the others. But alignment with social evidence is strongest when texts are represented through the top quartile of passages, suggesting that a text's impact may depend more on its most forward-looking moments than on sustaining a high level of innovation throughout.

Paper Structure

This paper contains 22 sections, 1 equation, 1 figure, 1 table.

Figures (1)

  • Figure 1: Literary studies journals and authors plotted by the average precocity and number of citations associated with their articles. Precocity is determined here by topic modeling. Since both axes are z-scores, the center of the whole corpus would be at 0, 0. We’re looking mostly at the upper right quadrant.
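The caption's remark that the corpus center sits at 0, 0 follows from how z-scores are defined: each value has the mean subtracted and is divided by the standard deviation, so the standardized values average to zero. The sketch below is only an illustration of that property. The function name `z_scores`, the choice of population standard deviation, and all the numbers are assumptions made here, not the paper's data or code.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: subtract the mean, divide by the standard deviation.

    Assumption: population standard deviation is used; the paper does not
    say which variant its figure relies on.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


# Hypothetical values, invented for illustration only.
citations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
precocity = [0.1, 0.3, 0.2, 0.4, 0.5, 0.3, 0.6, 0.8]

z_cit = z_scores(citations)
z_prec = z_scores(precocity)

# After standardization, both axes are centered on zero, so the
# "average" item of the whole corpus would be plotted at (0, 0).
center = (mean(z_cit), mean(z_prec))
```

Under this reading, a point in the upper right quadrant is above the corpus average on both measures at once.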