Information Content and Entropy of Finite Patterns from a Combinatorial Perspective

Zsolt Pocze

TL;DR

This work presents a unified combinatorial framework for the information content and entropy of finite patterns, extending beyond traditional Shannon information. It defines the information content I(A) as the minimum number of binary decisions needed to specify a pattern and connects it to Kolmogorov complexity, while anchoring the theory with edge cases such as constant, uniformly random, and ergodic Markov patterns. The paper derives explicit formulas for these edge cases, establishes general bounds and properties (normalization, subadditivity, reversibility, monotonicity, redundancy), and introduces practical estimation via Kolmogorov-based, compression-based, and alternative measures. Entropy is then defined as H_C(A)=I(A)/(n+1), bridging short-pattern behavior with asymptotic Shannon entropy for ergodic Markov sources, and enabling robust analysis of diverse data types, including short sequences. The framework supports combining multiple estimation methods to improve accuracy and provides a foundation for broader applications in information theory and data analysis.

Abstract

We present a unified combinatorial definition of the information content and entropy of different types of patterns. The definition is compatible with the traditional concepts of information and entropy, and it goes beyond the limitations of Shannon information, which is interpretable for ergodic Markov processes. We compare the information content of various finite patterns and derive general properties of information quantity from these comparisons. Using these properties, we define normalized information estimation methods based on compression algorithms and Kolmogorov complexity. From a combinatorial point of view, we redefine the concept of entropy in a way that is asymptotically compatible with traditional entropy.
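
As a rough illustration of the entropy definition H_C(A)=I(A)/(n+1) quoted in the TL;DR, the following minimal sketch plugs a compression-based proxy for I(A) into that formula. It rests on several assumptions that are not confirmed by this summary: n is taken to be the pattern length, I(A) is approximated by the raw zlib-compressed size in bits (container overhead included, so this is not the paper's normalized estimator), and the function names `compressed_bits` and `combinatorial_entropy` are hypothetical.

```python
import random
import zlib


def compressed_bits(pattern: bytes) -> int:
    """Crude upper-bound proxy for I(A): zlib-compressed size in bits (assumption)."""
    return 8 * len(zlib.compress(pattern, 9))


def combinatorial_entropy(pattern: bytes) -> float:
    """H_C(A) = I(A) / (n + 1), with n assumed to be the pattern length."""
    n = len(pattern)
    return compressed_bits(pattern) / (n + 1)


# Two of the edge cases named in the TL;DR: a constant and a uniformly random pattern.
rng = random.Random(0)  # fixed seed so the example is reproducible
constant = b"a" * 1000
uniform_random = bytes(rng.getrandbits(8) for _ in range(1000))

h_const = combinatorial_entropy(constant)
h_rand = combinatorial_entropy(uniform_random)
print(f"constant pattern: {h_const:.3f} bits per symbol")
print(f"random pattern:   {h_rand:.3f} bits per symbol")
```

Under these assumptions the constant pattern should score far lower than the random one, which is the qualitative behavior the edge cases are meant to anchor.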

Paper Structure (10 sections, 4 theorems, 21 equations, 2 figures, 3 tables)

This paper contains 10 sections, 4 theorems, 21 equations, 2 figures, 3 tables.

Key Result

Theorem 1

For finite patterns that can be generated by an ergodic Markov process, the value of $I_{mark}(A)$ is maximized precisely when the relative frequencies of the values are equal, i.e., $f_{rel}(x)=\frac{1}{k}$ for all $x\in X$.
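
A small numerical check can make the statement concrete. The sketch below does not use the paper's $I_{mark}$, whose definition is not reproduced in this summary. As a stand-in it assumes the plain frequency-based Shannon quantity $n\cdot H(f_{rel})$, and the helper name `shannon_information` is hypothetical.

```python
import math
from collections import Counter


def shannon_information(pattern: str) -> float:
    """n * H(f_rel) in bits, where f_rel are the relative symbol frequencies."""
    n = len(pattern)
    counts = Counter(pattern)
    return -sum(c * math.log2(c / n) for c in counts.values())


n, k = 12, 3
uniform = "abc" * 4        # f_rel(x) = 1/3 for every symbol
skewed = "a" * 8 + "bbcc"  # f_rel = (2/3, 1/6, 1/6)

info_uniform = shannon_information(uniform)
info_skewed = shannon_information(skewed)
print(f"uniform frequencies: {info_uniform:.2f} bits (n*log2(k) = {n * math.log2(k):.2f})")
print(f"skewed frequencies:  {info_skewed:.2f} bits")
```

With equal frequencies the stand-in quantity reaches $n\log_2 k$, and the skewed pattern of the same length and alphabet falls below it.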

Figures (2)

  • Figure 1: Comparison of the information values of various 1,000-character patterns (APPENDIX I) that differ in their sets of possible symbols. M denotes the maximum amount of information possible for a pattern of the given length and symbol set. S is the pattern’s modified Shannon information. T is the pattern’s information as measured by the GZip compression algorithm. K is the pattern’s approximate Kolmogorov complexity. The “random pattern” is a random binary pattern with a certain degree of redundancy, whereas the “structured pattern” is a 40 × 25 binary character matrix in which the '1' symbols are arranged in concentric circles. Because the Fibonacci sequence appears random, even the compression algorithm could not determine its information content, whereas its Kolmogorov complexity indicated a low information content. For the English text and the random pattern, both the Shannon-based method and the compression algorithm gave good results. For the structured pattern, however, the compression algorithm clearly gives a closer approximation of the real information content than the Shannon formula, which was originally designed for random patterns. (The algorithms used are described in APPENDIX II--IV.)
  • Figure 2: Entropy of a constant pattern as a function of $n$.
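
The contrast drawn in the Figure 1 caption between the Shannon formula and a compression algorithm on structured patterns can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a simple periodic binary string rather than the paper's APPENDIX I patterns, the unmodified frequency-based Shannon quantity rather than the paper's modified S, and the raw gzip output size rather than the paper's T. The helper names `shannon_bits` and `gzip_bits` are hypothetical.

```python
import gzip
import math
import random
from collections import Counter


def shannon_bits(pattern: str) -> float:
    """Frequency-based Shannon information n * H(f_rel), in bits."""
    n = len(pattern)
    return -sum(c * math.log2(c / n) for c in Counter(pattern).values())


def gzip_bits(pattern: str) -> int:
    """Size of the gzip-compressed pattern in bits, container overhead included."""
    return 8 * len(gzip.compress(pattern.encode("ascii")))


rng = random.Random(0)  # fixed seed for reproducibility
periodic = "01" * 500   # highly structured, 1,000 characters
shuffled = "".join(rng.sample(periodic, len(periodic)))  # same symbols, random order

for name, pattern in [("periodic", periodic), ("shuffled", shuffled)]:
    print(f"{name}: Shannon = {shannon_bits(pattern):.0f} bits, gzip = {gzip_bits(pattern)} bits")
```

Both strings have identical symbol frequencies, so the frequency-based measure cannot tell them apart, while the compressed size drops sharply for the periodic one.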

Theorems & Definitions (12)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Definition 3
  • Definition 4
  • Definition 5
  • Theorem 4
  • Definition 6
  • ...and 2 more