A ripple in time: a discontinuity in American history

Alexander Kolpakov; Igor Rivin

TL;DR

The paper tackles the challenge of extracting temporal structure and authorship signals from a small historical corpus whose texts vary widely in length and style (the State of the Union addresses). It combines transformer-based embeddings (GPT-2, DistilBERT) with nonlinear dimension reduction (UMAP, TriMAP, PaCMAP) to reveal latent temporal patterns and to perform authorship attribution without heavy fine-tuning. Key findings include roughly 90–95% authorship attribution accuracy and year-of-writing estimates accurate to within about one presidential term, along with a pronounced "ripple" around the late 1920s that appears across methods. The work shows that modern embedding and clustering techniques can uncover meaningful historical patterns in small corpora, and that the results can be reproduced on standard hardware using the public code and data.

Abstract

In this technical note we suggest a novel approach to discovering temporal (related and unrelated to language dilation) and personality (authorship attribution) aspects in historical datasets. We exemplify our approach on the State of the Union addresses given by the past 42 US presidents: this dataset is known for its relatively small amount of data and for the high variability in the size and style of its texts. Nevertheless, we achieve about 95% accuracy on the authorship attribution task, and pin down the date of writing to within a single presidential term.

Paper Structure (15 sections, 8 figures)

Results and novelty
Problems and methods
Dataset
Methods of study
Results and observations
Temporal clustering
BERT vs GPT-2
Authorship attribution of the addresses
Year of writing of an address
Discussion
Data accessibility
Authors’ contributions
Acknowledgments
Dimension reductions of GPT-2 embedding
Dimension reductions of DistilBERT embedding

Figures (8)

Figure 1: UMAP visualizations of the GPT-2 embedding of SOTU
Figure 2: Temporal clustering of SOTU embeddings
Figure 3: 2D visualizations of the GPT-2 embedding of SOTU
Figure 4: 3D visualizations of the GPT-2 embedding of SOTU
Figure 5: Temporal clustering of SOTU addresses: GPT-2 embedding followed by a dimension reduction technique
...and 3 more figures