Competition between Two Kinds of Correlations in Literary Texts

S. S. Melnyk; O. V. Usatenko; V. A. Yampol'skii; V. A. Golick

TL;DR

The paper addresses how to quantify and model long-range correlations in coarse-grained literary texts using additive Markov chains with memory functions. It develops a framework linking memory functions to observed variance and correlation, and demonstrates that texts exhibit antipersistent short-range and power-law persistent long-range correlations, which together shape text statistics. Through analysis of the Bible and other works, it shows a robust, two-regime memory structure and reveals self-similarity under decimation, highlighting grammatical versus semantic contributions. The approach provides a compact, transferable descriptor (the memory function) for symbolic sequences and suggests broader applications to other complex correlated systems.

Abstract

A theory of additive Markov chains with long-range memory is used to describe the correlation properties of coarse-grained literary texts. The analysis reveals a complex correlation structure: antipersistent correlations at short distances, L < 300, and persistent ones at L > 300 define this nontrivial structure. For several specific literary texts, the memory functions are obtained and shown to follow a power law at long distances. This property is shown to be a cause of the self-similarity of texts with respect to the decimation procedure.
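The quantities named in the abstract can be illustrated with a small sketch. It is not the authors' code, and it rests on several assumptions that are not stated above: the coarse-graining rule (letters a to m map to 0, n to z map to 1), the function names, and the choice of diagnostic. The diagnostic compares the variance D(L) of the number of ones in windows of length L with the binomial value L·p·(1−p), which is what a sequence of independent symbols with the same mean p would give. Under that convention, D(L) below the baseline indicates antipersistence and D(L) above it indicates persistence. The `decimate` helper shows the simplest form of decimation, keeping every `step`-th symbol.

```python
from statistics import pvariance


def coarse_grain(text, split="m"):
    """Map letters a..split to 0 and later letters to 1; drop non-letters.

    The split point is an illustrative assumption, not taken from the text above.
    """
    return [0 if ch <= split else 1 for ch in text.lower() if "a" <= ch <= "z"]


def window_variance(seq, L):
    """Variance D(L) of the number of ones over all sliding windows of length L."""
    if not 1 <= L <= len(seq):
        raise ValueError("L must satisfy 1 <= L <= len(seq)")
    count = sum(seq[:L])
    counts = [count]
    # Slide the window one symbol at a time, updating the running count.
    for i in range(L, len(seq)):
        count += seq[i] - seq[i - L]
        counts.append(count)
    return pvariance(counts)


def uncorrelated_variance(seq, L):
    """Binomial variance L*p*(1-p) for independent symbols with the same mean p."""
    p = sum(seq) / len(seq)
    return L * p * (1 - p)


def decimate(seq, step):
    """Keep every `step`-th symbol (regular decimation)."""
    return seq[::step]


if __name__ == "__main__":
    alternating = [0, 1] * 50          # strongly antipersistent toy sequence
    blocks = [0] * 50 + [1] * 50       # strongly persistent toy sequence
    for name, seq, L in (("alternating", alternating, 2), ("blocks", blocks, 10)):
        d, d0 = window_variance(seq, L), uncorrelated_variance(seq, L)
        kind = "antipersistent" if d < d0 else "persistent"
        print(f"{name}: D({L}) = {d:.2f}, baseline = {d0:.2f} -> {kind}")
```

For a real text, one would pass `coarse_grain(text)` to `window_variance` over a range of L and look for the crossover between the two regimes. The toy sequences here only demonstrate the sign of the deviation from the baseline.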
