The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory

Romain Peyrichou

TL;DR

This work identifies six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. It connects the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry.

Abstract

Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar induction). Generation and recognition are extensionally equivalent -- they characterize the same set -- but operationally asymmetric in multiple independent ways. Inference is a qualitatively harder problem: it does not have access to a known grammar. Despite the centrality of this triad to compiler design, natural language processing, and formal language theory, no survey has treated it as a unified, multidimensional phenomenon. We identify six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. We show that the common characterization "generation is easy, parsing is hard" is misleading: unconstrained generation is trivial, but generation under constraints can be NP-hard. The real asymmetry is that parsing is always constrained (the input is given) while generation need not be. Two of these dimensions -- directionality and temporality -- have not previously been identified as dimensions of the generation-recognition asymmetry. We connect the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry between a generator (surprisal = 0) and a parser that predicts under uncertainty (surprisal > 0). We review bidirectional systems in NLP and observe that bidirectionality has been available for fifty years yet has not transferred to most domain-specific applications. We conclude with a discussion of large language models, which architecturally unify generation and recognition while operationally preserving the asymmetry.
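
As a rough illustration of the surprisal claim in the abstract, the sketch below computes surprisal as $-\log_2 P(\text{token} \mid \text{context})$, the standard definition in the Hale (2001) and Levy (2008) framework. The toy distribution `NEXT_TOKEN_PROBS` and the function names are hypothetical and chosen only for this example; modelling the generator as assigning probability 1 to the token it has already chosen is one reading of "surprisal = 0", not necessarily the paper's formal construction.

```python
import math

# Hypothetical next-token distribution for a single context.
# The probabilities are illustrative only, not taken from the paper.
NEXT_TOKEN_PROBS = {
    ("the",): {"cat": 0.5, "dog": 0.25, "idea": 0.25},
}


def surprisal_bits(prob: float) -> float:
    """Surprisal in bits: log2(1 / p), equivalent to -log2(p)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1.0 / prob)


def parser_surprisal(context: tuple, token: str) -> float:
    """The parser must predict the next token under uncertainty."""
    return surprisal_bits(NEXT_TOKEN_PROBS[context][token])


def generator_surprisal(intended: str, token: str) -> float:
    """Assumption: the generator already knows its own choice, so the
    token it intends to emit has probability 1 from its point of view."""
    if token != intended:
        raise ValueError("a generator only emits the token it intended")
    return surprisal_bits(1.0)


if __name__ == "__main__":
    print(parser_surprisal(("the",), "cat"))   # parser side: positive
    print(generator_surprisal("cat", "cat"))   # generator side: zero
```

Under these assumptions the same emitted token carries positive surprisal for the parser and zero for the generator, which is the asymmetry the abstract attributes to the temporal dimension.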

Paper Structure

This paper contains 35 sections, 14 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Shannon's communication model and the generation-recognition analogy. The encoder (generator) knows the source with certainty; the decoder (recognizer) must infer it under equivocation $H(X|Y) > 0$.
  • Figure 2: Commutative diagram: the round-trip test. Generation followed by recognition should recover the original structure — the failure of this commutativity measures the asymmetry.
  • Figure 3: Complexity heatmaps for generation (top) and recognition (bottom). Rows represent task specification; columns represent grammar class (Chomsky hierarchy). Each complexity level has a unique color, consistent across both matrices and Figure 4 below. The gradient — from green ($\mathcal{O}(n)$) through yellow, orange, and red to dark gray ($\notin \mathsf{R}$) — reveals the differential coupling: in the generation matrix, color varies primarily along rows; in the recognition matrix, it varies primarily along columns.
  • Figure 4: The complexity gap (semi-log scale). Seven curves corresponding to the complexity levels identified in Figure 3, with a consistent color palette. Polynomial bounds appear as straight lines; exponential bounds diverge rapidly. The dashed vertical line marks undecidability ($\notin \mathsf{R}$).
  • Figure 5: The three operations: a hierarchy of difficulty. Canonical case ($k = 2$, simplest tasks). Full ranges: $\mathcal{O}(n)$ to $\notin \mathsf{R}$ — see §4.1.
  • ...and 1 more figure
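
The equivocation $H(X|Y) > 0$ in the Figure 1 caption and the round-trip test in the Figure 2 caption can be illustrated with a small calculation. The sketch below is a minimal example under stated assumptions: $X$ is a hypothetical derivation (structure), $Y$ is the string it yields, and the joint distributions are invented for illustration. It uses the standard definition $H(X|Y) = -\sum_{x,y} p(x,y) \log_2 p(x \mid y)$.

```python
import math
from collections import defaultdict


def conditional_entropy_bits(joint: dict) -> float:
    """H(X|Y) in bits for a joint distribution {(x, y): p(x, y)}."""
    # Marginal p(y), obtained by summing the joint over x.
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    # -sum p(x,y) log2 p(x|y), written as +sum p(x,y) log2(p(y) / p(x,y)).
    total = 0.0
    for (_, y), p in joint.items():
        if p > 0.0:
            total += p * math.log2(p_y[y] / p)
    return total


# Hypothetical ambiguous grammar: two structures yield the same string,
# so the recognizer cannot tell which one the generator used.
AMBIGUOUS = {("tree_a", "s"): 0.5, ("tree_b", "s"): 0.5}

# Hypothetical unambiguous grammar: each string has exactly one structure,
# so the round trip recovers the original structure.
UNAMBIGUOUS = {("tree_a", "s1"): 0.5, ("tree_b", "s2"): 0.5}

if __name__ == "__main__":
    print(conditional_entropy_bits(AMBIGUOUS))
    print(conditional_entropy_bits(UNAMBIGUOUS))
```

In this toy setting the ambiguous case leaves one bit of uncertainty about the structure after the string is observed, while the unambiguous case leaves none. That is one simple way to quantify how far the round trip is from commuting.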