Simplified derivations for high-dimensional convex learning problems

David G. Clark; Haim Sompolinsky

TL;DR

The paper introduces a concise, non-replica cavity framework for high-dimensional convex learning problems. It unifies perceptron classification of points, perceptron classification of manifolds, and kernel ridge regression by treating each as a bipartite system of interacting feature and datum variables. Using a zero-temperature cavity method together with symmetry arguments, it derives capacity and generalization-related results, and it identifies a symmetry that explains why a naive analysis can still yield correct perceptron capacities. A central step is the computation of explicit response functions S^w and S^λ, one for each side of the bipartite system, which govern how capacities scale with data and show how performing cavities on both sides resolves the inconsistencies of the naive analysis. The framework extends readily to correlated data and dynamical settings, offering a tractable, intuitive route to understanding high-dimensional learning systems and their generalization properties in terms of a common bipartite geometry.

Abstract

Statistical-physics calculations in machine learning and theoretical neuroscience often involve lengthy derivations that obscure physical interpretation. Here, we give concise, non-replica derivations of several key results and highlight their underlying similarities. In particular, using a cavity approach, we analyze three high-dimensional learning problems: perceptron classification of points, perceptron classification of manifolds, and kernel ridge regression. These problems share a common structure, a bipartite system of interacting feature and datum variables, which enables a unified analysis. Furthermore, for perceptron-capacity problems, we identify a symmetry that allows derivation of correct capacities through a naive method.
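
To make the bipartite structure concrete, the following is a minimal numerical sketch for ridge regression with a linear kernel. It is an illustration only, not the paper's cavity derivation. The variable names (`lam` for the datum variables, `w_dual` and `w_primal` for the feature variables), the sizes `P` and `N`, and the ridge value are all assumptions chosen for the example. It checks that the weights obtained from one variable per training example match those obtained by solving directly over one variable per feature.

```python
import numpy as np

rng = np.random.default_rng(0)
P, N = 50, 200   # number of data points and of features (illustrative values)
ridge = 0.1      # ridge parameter (illustrative value)

# Rows of X are data points; y holds arbitrary real-valued targets.
X = rng.standard_normal((P, N)) / np.sqrt(N)
y = rng.standard_normal(P)

# Datum variables: one per training example, from the P x P Gram matrix.
lam = np.linalg.solve(X @ X.T + ridge * np.eye(P), y)

# Feature variables: one per input dimension, driven by the datum variables.
w_dual = X.T @ lam

# Direct solve over the feature variables, for comparison:
# minimizes 0.5 * ||X w - y||^2 + 0.5 * ridge * ||w||^2.
w_primal = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ y)

max_diff = float(np.max(np.abs(w_dual - w_primal)))
print("largest discrepancy between the two solutions:", max_diff)
```

The two solves agree up to numerical precision because each feature variable is a weighted sum of datum variables through the data matrix, and each datum variable depends on the features through the same matrix. This is the two-sided coupling that the abstract describes as a bipartite system.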
