
Towards a Theory of Pragmatic Information

Edward D. Weinberger

TL;DR

The paper introduces pragmatic information, a decision-centered analogue of information theory, defined as the KL divergence between a decision maker’s posterior and prior beliefs about the world, $D_\Delta(\boldsymbol{p}_m \,\|\, \boldsymbol{q})$. It extends this to ensembles, derives a free-information interpretation $\Phi_\Delta(\mathcal{M}; \Omega) = I(\mathcal{M}; \Omega) + D_\Delta(p \,\|\, q)$, and establishes core properties including non-negativity, additivity, and a chain-rule structure. The framework is illustrated with a one-armed bandit example, connects to mutual information and encoding efficiency, and is applied to reformulate the efficient market hypothesis and to discuss the role of computational capacity in information processing. The work suggests that pragmatic information captures the usable, decision-relevant content of information, with implications for biology, finance, and collective intelligence, and highlights the importance of receivers’ computational constraints in shaping informational value.

Abstract

Standard information theory says nothing about how much meaning is conveyed by a message. We fill this gap with a rigorously justifiable, quantitative definition of "pragmatic information", the amount of meaning in a message relevant to a particular decision. We posit that such a message updates a random variable, $\omega$, that informs the decision. The pragmatic information of a single message is then defined as the Kullback-Leibler divergence between the prior and posterior probabilities of $\omega$; the pragmatic information of a message ensemble is the expected value of the pragmatic information of the ensemble's component messages. We justify these definitions by proving that the pragmatic information of a single message is the expected difference between the lengths of the shortest binary encodings of $\omega$ under the a priori and a posteriori distributions, and that the average of the pragmatic values of individual messages, when sampled a large number of times from the ensemble, approaches its expected value. Pragmatic information is non-negative and additive for independent decisions and "pragmatically independent" messages. Also, pragmatic information is the information analogue of free energy: just as free energy quantifies the part of a system's total energy available to do useful work, so pragmatic information quantifies the information actually used in making a decision. We sketch three applications: the single play of a slot machine, a.k.a. a "one-armed bandit", with an unknown payout probability; a characterization of the rate of biological evolution in the so-called "quasi-species" model; and a reformulation of the efficient market hypothesis of finance. We note the importance of the computational capacity of the receiver in each case.
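The definitions in the abstract can be made concrete with a small numerical sketch. The prior, the two posteriors, and the message probabilities below are hypothetical numbers chosen only for illustration, and the sketch assumes that the $p$ in the TL;DR's free-information formula denotes the message-averaged posterior (this page does not define it). Under that assumption, the ensemble value splits into mutual information plus a prior-mismatch term.

```python
import math

def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; zero-probability terms contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical numbers, not taken from the paper.
q = [0.5, 0.3, 0.2]                # prior over three outcomes of omega
posteriors = {                     # posterior p_m induced by each message m
    "m1": [0.8, 0.1, 0.1],
    "m2": [0.2, 0.5, 0.3],
}
phi = {"m1": 0.4, "m2": 0.6}       # probability of receiving each message

# Pragmatic information of each single message: D(p_m || q).
single = {m: kl_bits(p_m, q) for m, p_m in posteriors.items()}

# Pragmatic information of the ensemble: expectation over messages.
ensemble = sum(phi[m] * single[m] for m in phi)

# Message-averaged posterior (assumed meaning of p) and mutual information I(M; Omega).
p_bar = [sum(phi[m] * posteriors[m][i] for m in phi) for i in range(len(q))]
mutual_info = sum(phi[m] * kl_bits(posteriors[m], p_bar) for m in phi)

# Decomposition: I(M; Omega) + D(p_bar || q).
decomposition = mutual_info + kl_bits(p_bar, q)
```

In this sketch `ensemble` and `decomposition` agree up to floating-point rounding, and the mismatch term vanishes when the prior `q` equals the averaged posterior `p_bar`, leaving only the mutual information.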

Paper Structure

This paper contains 13 sections, 13 theorems, 44 equations, and 1 figure.

Key Result

Theorem 2.3.1

Suppose $L_{\mathbf q}(\omega)$ is the length, in bits, of the shortest binary code required to communicate that $\Delta$ has decided upon outcome $\omega$, assuming the prior probabilities $\mathbf q$, and suppose $L_{\mathbf p_m}(\omega)$ is the corresponding length, assuming the posterior probabilities $\mathbf p_m$. Then $D_\Delta(\mathbf p_m \,\|\, \mathbf q) = {\mathcal{E}}\left[L_{\mathbf q}(\omega) - L_{\mathbf p_m}(\omega)\right]$, where ${\mathcal{E}}\left[L_{\mathbf q}(\omega) - L_{\mathbf p_m}(\omega)\right]$ is the expected …
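A minimal sketch of the coding interpretation follows. It makes two assumptions that are not stated on this page: the expectation is taken under the posterior $\mathbf p_m$, and the shortest code length for an outcome of probability $r$ is idealized as $-\log_2 r$ bits. The distributions are hypothetical and dyadic (powers of 1/2), so the idealized lengths are whole numbers of bits.

```python
import math

def ideal_code_length(prob):
    """Idealized shortest code length, in bits, for an outcome of probability prob."""
    return -math.log2(prob)

# Hypothetical dyadic distributions over four outcomes of omega.
q = [1/2, 1/4, 1/8, 1/8]      # prior
p_m = [1/8, 1/8, 1/4, 1/2]    # posterior after message m

# Expected extra bits paid, under p_m, for coding with the prior-based code
# instead of the posterior-based code.
expected_extra_bits = sum(
    p * (ideal_code_length(qi) - ideal_code_length(p))
    for p, qi in zip(p_m, q)
)

# Kullback-Leibler divergence D(p_m || q) in bits.
kl = sum(p * math.log2(p / qi) for p, qi in zip(p_m, q))
```

For these numbers both quantities evaluate to 0.875 bits: the prior-based code spends 1, 2, 3, 3 bits on the four outcomes, the posterior-based code spends 3, 3, 2, 1, and weighting the differences by `p_m` gives the divergence.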

Figures (1)

  • Figure 1: Pragmatic information for the "one-armed bandit"

Theorems & Definitions (23)

  • Definition
  • Theorem 2.3.1: Wrong Code Theorem
  • proof
  • Theorem 2.3.2: Chain Rule for Kullback-Leibler Divergence
  • Corollary: Additivity of Kullback-Leibler Divergence
  • Theorem 2.3.3: Non-negativity of Kullback-Leibler Divergence
  • Theorem 2.3.4: Convexity of Kullback-Leibler Divergence
  • Definition
  • Corollary
  • proof
  • ...and 13 more
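
For orientation, the Kullback-Leibler properties named in Theorems 2.3.2 to 2.3.4 and the additivity corollary are usually written as follows. These are the standard textbook forms for discrete distributions, with generic symbols $p$, $q$, $x$, $y$, $\lambda$; the paper's own notation and hypotheses may differ.

$$
\begin{aligned}
\text{Chain rule:}\quad & D\big(p(x,y)\,\|\,q(x,y)\big) = D\big(p(x)\,\|\,q(x)\big) + \sum_x p(x)\, D\big(p(y\mid x)\,\|\,q(y\mid x)\big) \\
\text{Additivity:}\quad & D\big(p_1 p_2\,\|\,q_1 q_2\big) = D(p_1\,\|\,q_1) + D(p_2\,\|\,q_2) \quad \text{for product distributions} \\
\text{Non-negativity:}\quad & D(p\,\|\,q) \ge 0, \text{ with equality if and only if } p = q \\
\text{Convexity:}\quad & D\big(\lambda p_1 + (1-\lambda) p_2 \,\|\, \lambda q_1 + (1-\lambda) q_2\big) \le \lambda D(p_1\,\|\,q_1) + (1-\lambda) D(p_2\,\|\,q_2), \quad 0 \le \lambda \le 1
\end{aligned}
$$

Additivity follows from the chain rule because the conditional term reduces to $D(p_2\,\|\,q_2)$ when both joint distributions factorize.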