Towards a Law of Iterated Expectations for Heuristic Estimators

Paul Christiano; Jacob Hilton; Andrea Lincoln; Eric Neyman; Mark Xu


Abstract

Christiano et al. (2022) define a *heuristic estimator* to be a hypothetical algorithm that estimates the values of mathematical expressions from arguments. In brief, a heuristic estimator $\mathbb{G}$ takes as input a mathematical expression $Y$ and a formal "heuristic argument" $\pi$, and outputs an estimate $\mathbb{G}(Y \mid \pi)$ of $Y$. In this work, we argue for the informal principle that a heuristic estimator ought not to be able to predict its own errors, and we explore approaches to formalizing this principle. Most simply, the principle suggests that $\mathbb{G}(Y - \mathbb{G}(Y \mid \pi) \mid \pi)$ ought to equal zero for all $Y$ and $\pi$. We argue that an ideal heuristic estimator ought to satisfy two stronger properties in this vein, which we term *iterated estimation* (by analogy to the law of iterated expectations) and *error orthogonality*. Although iterated estimation and error orthogonality are intuitively appealing, it can be difficult to determine whether a given heuristic estimator satisfies them. As an alternative approach, we explore *accuracy*: a property that (roughly) states that $\mathbb{G}$ has zero average error over a distribution of mathematical expressions. However, in the context of two estimation problems, we demonstrate barriers to creating an accurate heuristic estimator. We finish by discussing challenges and potential paths forward for finding a heuristic estimator that accords with our intuitive understanding of how such an estimator ought to behave, as well as the potential applications of heuristic estimators to understanding the behavior of neural networks.
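
For orientation, the classical fact behind the analogy can be written out in one line. This is only the probabilistic counterpart, not a statement about $\mathbb{G}$: the $\sigma$-algebra $\mathcal{H}$ is notation assumed here for illustration, standing in for the role played by $\pi$, and $Y$ is taken to be an integrable random variable. By linearity, and because $\mathbb{E}[Y \mid \mathcal{H}]$ is $\mathcal{H}$-measurable,

$$\mathbb{E}\big[\,Y - \mathbb{E}[Y \mid \mathcal{H}] \;\big|\; \mathcal{H}\,\big] = \mathbb{E}[Y \mid \mathcal{H}] - \mathbb{E}[Y \mid \mathcal{H}] = 0.$$

The simplest form of the principle above asks for the same identity with $\mathbb{G}(\cdot \mid \pi)$ in place of $\mathbb{E}[\cdot \mid \mathcal{H}]$.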


Paper Structure

This paper contains 35 sections, 10 theorems, 83 equations, 1 figure, 2 tables, 1 algorithm.

Introduction
A running example
Perspectives on heuristic estimation
Analogy to proof verification.
Analogy to conditional expectations.
Analogy to subjective probabilities and estimates.
The principle of unpredictable errors
Outline of this work
Related work
The subjective properties: Iterated estimation and error orthogonality
Motivation and definitions
Challenges with the subjective approach
Accuracy as an objective measure of error unpredictability
Motivation and definitions
Multiaccuracy as a constraint on argument merges
...and 20 more sections

Key Result

Proposition 2.3

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with $\sigma$-sub-algebras $\mathcal{H}' \subseteq \mathcal{H} \subseteq \mathcal{F}$. Let $X, Y$ be random variables satisfying $\mathbb{E}\left[X^2\right], \mathbb{E}\left[Y^2\right] < \infty$. Then

Figures (1)

Figure 1: Let $\mathcal{Y}$ be the space of expressions of the form $2 \cdot c_1 + 3 \cdot c_2$, where $c_1, c_2 \in \mathbb{R}$, and let $\mathcal{D}$ be the distribution over $\mathcal{Y}$ obtained by selecting $c_1, c_2$ independently from $\mathcal{N}(0, 1)$. This figure classifies estimators of $Y \in \mathcal{Y}$ based on whether they are $1$-accurate, $c_1$-accurate, and self-accurate over $\mathcal{D}$.
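
The classification in the caption can be explored numerically. The sketch below is a rough Monte Carlo check, not the paper's method. It assumes that "$X$-accurate over $\mathcal{D}$" means the average of (error times $X$) is zero, and that "self-accurate" means the average of (error times the estimate) is zero; these readings are assumptions, since the definitions are not reproduced on this page. The four estimators and all function names are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical estimators of Y = 2*c1 + 3*c2, given the coefficients (c1, c2).
ESTIMATORS = {
    "zero": lambda c1, c2: 0.0,             # ignores both terms
    "first_term": lambda c1, c2: 2.0 * c1,  # accounts for the first term only
    "overshoot": lambda c1, c2: 4.0 * c1,   # doubles the first term
    "constant_one": lambda c1, c2: 1.0,     # always guesses 1
}


def accuracy_gaps(estimator, n=100_000, seed=0):
    """Estimate the mean of error * X for X = 1, X = c1, and X = the estimate.

    Under the assumed definitions, a gap near zero suggests the estimator is
    1-accurate, c1-accurate, or self-accurate, respectively.
    """
    rng = random.Random(seed)
    totals = {"1": 0.0, "c1": 0.0, "self": 0.0}
    for _ in range(n):
        # Draw c1, c2 independently from N(0, 1), as in the caption.
        c1, c2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        guess = estimator(c1, c2)
        error = (2.0 * c1 + 3.0 * c2) - guess
        totals["1"] += error
        totals["c1"] += error * c1
        totals["self"] += error * guess
    return {name: total / n for name, total in totals.items()}


if __name__ == "__main__":
    for name, estimator in ESTIMATORS.items():
        gaps = accuracy_gaps(estimator)
        print(name, {k: round(v, 2) for k, v in gaps.items()})
```

Under these assumed definitions, the exact gaps can be worked out by hand: the `first_term` estimator has all three gaps equal to zero, `zero` has a $c_1$-gap of $2$ but vanishing $1$- and self-gaps, `overshoot` has a $c_1$-gap of $-2$ and a self-gap of $-8$, and `constant_one` fails all three. The simulated values should land close to these, up to sampling noise.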

Theorems & Definitions (43)

Definition 2.1
Example 2.2
Proposition 2.3: Projection law of conditional expectations, see e.g. moshayedi2022conditional
Definition 2.4
Example 2.5
Definition 3.1
Example 3.2
Remark 3.3
Proposition 3.4
proof
...and 33 more
