
Can Machines Learn the True Probabilities?

Jinsook Kim

TL;DR

The paper investigates whether AI systems can identify and use the true objective probabilities $P(A_{t+1}|\beta_t)$ governing data generation. It defines learning as aligning the machine's subjective forecast $\Pi(A_{t+1}|\beta_t)$ with $P(A_{t+1}|\beta_t)$ under a success criterion, and shows that learnability holds only when the true probabilities are directly observable by empirical frequency, i.e., via a population. The paper proves impossibility results under Nature's perversity (formalized via cases and counterexamples such as Oakes) and identifies a single idealized scenario in which learning is possible: when a stopping time $t_s$ exists and direct observability holds. These results tie calibration to observability, relate computability to the problem, and imply limited scope for relaxing standard time-series assumptions such as stationarity or ergodicity.

Abstract

When there exists uncertainty, AI machines are designed to make decisions so as to reach the best expected outcomes. Expectations are based on true facts about the objective environment the machines interact with, and those facts can be encoded into AI models in the form of true objective probability functions. Accordingly, AI models involve probabilistic machine learning in which the probabilities should be objectively interpreted. We prove under some basic assumptions when machines can learn the true objective probabilities, if any, and when machines cannot learn them.

Paper Structure

This paper contains 13 sections and 20 theorems.

Key Result

Theorem 4.1

Suppose that $\xi_t$ is $\beta_{t-1}$-measurable. Then $\Pi(p_k \rightarrow \alpha) = 1$ as $k \to \infty$, where $k$ is the number of days in the test set and $p_k = \left(\sum_{t=1}^{k} \xi_t\right)^{-1} \cdot \left(\sum_{t=1}^{k} \xi_t \cdot 1_{\{A_{t+1}\}}\right)$.
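
To make the statistic $p_k$ concrete, the following is a minimal sketch of how it could be computed on a finite test set. It assumes, for illustration only, that each $\xi_t$ is a 0/1 selection indicator and that each outcome is recorded as 1 when the event $A_{t+1}$ occurred and 0 otherwise; the function name `selected_frequency` and its argument names are hypothetical and do not come from the paper.

```python
from typing import Sequence


def selected_frequency(xi: Sequence[float], outcomes: Sequence[int]) -> float:
    """Compute p_k = (sum_t xi_t)^(-1) * sum_t xi_t * 1{A_{t+1}}.

    xi       -- selection weights xi_1..xi_k (assumed here to be 0/1 indicators)
    outcomes -- indicators 1{A_{t+1}} for t = 1..k (1 if the event occurred)
    """
    if len(xi) != len(outcomes):
        raise ValueError("xi and outcomes must have the same length k")

    total_selected = sum(xi)
    if total_selected == 0:
        # p_k is undefined when no day is selected.
        raise ValueError("at least one day must be selected")

    # Count the event only on the selected days.
    selected_hits = sum(x * a for x, a in zip(xi, outcomes))
    return selected_hits / total_selected


if __name__ == "__main__":
    xi = [1, 0, 1, 1, 0]        # days 1, 3 and 4 are selected
    outcomes = [1, 1, 0, 1, 0]  # the event occurred after days 1, 2 and 4
    print(selected_frequency(xi, outcomes))
```

In this toy input three days are selected and the event occurs on two of them, so $p_k = 2/3$. The theorem concerns the limit of this quantity as $k \to \infty$, which a finite computation like this can only illustrate.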

Theorems & Definitions (40)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Remark 2.4
  • Remark 3.1
  • Remark 3.2
  • Theorem 4.1
  • Lemma 4.5
  • Theorem 4.6
  • Remark 4.7
  • ...and 30 more