Why Code, Why Now: Learnability, Computability, and the Real Limits of Machine Learning

Zhimin Zhao

TL;DR

The paper asks why code generation scales more predictably than reinforcement learning, and argues that task learnability is governed by information structure rather than model size. It introduces a five-level hierarchy of learnability based on feedback quality and formalizes expressibility, computability, and learnability within a unified template using risk functionals. The analysis explains why supervised learning on code benefits from dense, locally verifiable signals, while RL suffers from misaligned, non-stationary, or reflexive rewards that weaken its scaling behavior. It also proposes three practical paths forward (task decomposition, engineered feedback, and weaker objectives) for turning unlearnable tasks into learnable ones and guiding future scaling efforts.

Abstract

Code generation has progressed more reliably than reinforcement learning, largely because code has an information structure that makes it learnable. Code provides dense, local, verifiable feedback at every token, whereas most reinforcement learning problems do not. This difference in feedback quality is not binary but graded. We propose a five-level hierarchy of learnability based on information structure and argue that the ceiling on ML progress depends less on model size than on whether a task is learnable at all. The hierarchy rests on a formal distinction among three properties of computational problems (expressibility, computability, and learnability). We establish their pairwise relationships, including where implications hold and where they fail, and present a unified template that makes the structural differences explicit. The analysis suggests why supervised learning on code scales predictably while reinforcement learning does not, and why the common assumption that scaling alone will solve remaining ML challenges warrants scrutiny.
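The contrast between dense, local feedback and a single end-of-episode signal can be made concrete with a toy sketch. This is an illustration only, not the paper's formalism: the function names `dense_feedback` and `sparse_feedback`, the token lists, and the exact-match scoring are all assumptions introduced here.

```python
from typing import List


def dense_feedback(predicted: List[str], reference: List[str]) -> List[int]:
    """Hypothetical token-level signal: one 0/1 value per position."""
    return [int(p == r) for p, r in zip(predicted, reference)]


def sparse_feedback(predicted: List[str], reference: List[str]) -> int:
    """Hypothetical episodic signal: a single 0/1 value for the whole sequence."""
    return int(predicted == reference)


# Toy example: the prediction differs from the reference in exactly one token.
reference = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
predicted = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "-", "b"]

dense = dense_feedback(predicted, reference)    # twelve signals, one per token
sparse = sparse_feedback(predicted, reference)  # one signal for all twelve tokens
first_error = dense.index(0)                    # position of the first mismatch
```

In this sketch the token-level signal pinpoints the position of the error, while the sequence-level signal only reports that something, somewhere, went wrong.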

Paper Structure (43 sections, 2 theorems, 12 equations, 7 tables)

Introduction
Related Work
Language identification and generation in the limit.
PAC learning and VC theory.
Computability and undecidability.
Scaling laws and reinforcement learning.
Goodhart's law and reward misspecification.
What Makes Code Special
Hard Syntactic Constraints
Locally Identifiable Errors
Strong Compositionality
Why Supervised Learning Outperforms Reinforcement Learning on Code
Formal Grounding
A Hierarchy of Learnability
Formal Foundations
...and 28 more sections

Key Result

Proposition 1

Theorems & Definitions (23)

Definition 1: Expressibility
Remark 1: Expressibility is relative
Remark 2: Well-definedness of the supremum
Remark 3: Risk as zero-one uniform loss
Definition 2: Computability
Remark 4: Totality is essential
Definition 3: PAC learnability
Remark 5: Realizable case
Remark 6: Improper learning
Remark 7: Sample complexity and VC dimension
...and 13 more
