Storage capacity of perceptron with variable selection

Yingying Xu; Masayuki Ohzeki; Yoshiyuki Kabashima

TL;DR

This work analyzes how restricting a perceptron to a sparse subset of input features changes its storage capacity for random patterns. Using the nonrigorous replica method, the authors quantify the typical number of feature subsets that permit perfect classification and derive the capacity $\alpha_{\mathrm{VS}}$ under optimal variable selection, showing that it can exceed the classical Cover–Gardner bound $\alpha_{\mathrm{CG}} = 2\rho$. The study identifies the regions where replica symmetry holds and those where de Almeida–Thouless (AT) instabilities imply replica symmetry breaking, which points to a richer landscape for discriminating structure from noise in high dimensions. Numerical simulations with BIHT-based methods corroborate qualitative gains from variable selection beyond $\alpha_{\mathrm{CG}}$, with implications for sparse associative memories and resource-constrained learning. The results give a principled criterion for deciding when a learned feature set reflects genuine structure rather than chance correlations, and they connect statistical mechanics with modern sparse learning theory.

Abstract

A central challenge in machine learning is to distinguish genuine structure from chance correlations in high-dimensional data. In this work, we address this issue for the perceptron, a foundational model of neural computation. Specifically, we investigate the relationship between the pattern load $\alpha$ and the variable selection ratio $\rho$ for which a simple perceptron can perfectly classify $P = \alpha N$ random patterns by optimally selecting $M = \rho N$ out of $N$ variables. The Cover–Gardner theory establishes that a random subset of $\rho N$ dimensions can separate $\alpha N$ random patterns if and only if $\alpha < 2\rho$. We demonstrate that optimal variable selection can surpass this bound. To do so, we develop a method, based on the replica method from statistical mechanics, for enumerating the combinations of variables that enable perfect pattern classification. This not only provides a quantitative criterion for distinguishing true structure in the data from spurious regularities, but also yields the storage capacity of associative memory models with sparse asymmetric couplings.
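To make the Cover–Gardner threshold concrete, the following minimal sketch evaluates Cover's function-counting formula for finite sizes. It assumes patterns in general position and a perceptron without a bias term. The function names (`cover_dichotomy_count`, `cover_separable_probability`, `annealed_log_subset_count`) are hypothetical and are not taken from the paper. The last function is only a first-moment (annealed) estimate of the number of separating subsets, obtained by multiplying the number of subsets by the separability probability of a single subset. It is not the quenched replica calculation described above, and by Markov's inequality it can only bound the typical count from above.

```python
from math import comb, log


def cover_dichotomy_count(num_patterns: int, num_dims: int) -> int:
    """Number of linearly separable dichotomies of num_patterns points in
    general position in num_dims dimensions (Cover's counting formula,
    separating hyperplanes through the origin)."""
    return 2 * sum(comb(num_patterns - 1, k) for k in range(num_dims))


def cover_separable_probability(num_patterns: int, num_dims: int) -> float:
    """Probability that uniformly random labels are linearly separable:
    the separable dichotomies divided by all 2**num_patterns labelings."""
    return cover_dichotomy_count(num_patterns, num_dims) / 2 ** num_patterns


def annealed_log_subset_count(n: int, m: int, p: int) -> float:
    """Log of the EXPECTED number of size-m subsets (out of n variables)
    that separate p random patterns. Assumption: a first-moment estimate,
    not the paper's replica result; it upper-bounds the typical count."""
    return log(comb(n, m)) + log(cover_dichotomy_count(p, m)) - p * log(2)


if __name__ == "__main__":
    n, m = 40, 10  # N variables and M = rho * N selected ones, rho = 0.25
    for p in (10, 20, 24, 30):
        print(
            f"P={p:2d}  alpha={p / n:.3f}  "
            f"P_sep(random subset)={cover_separable_probability(p, m):.4f}  "
            f"log E[#subsets]={annealed_log_subset_count(n, m, p):.2f}"
        )
```

At $P = 2M$ the separability probability of a single random subset is exactly $1/2$, which is the finite-size counterpart of the threshold $\alpha = 2\rho$. For loads somewhat above that threshold, the expected number of separating subsets can still be large, which is the intuition behind selecting variables rather than drawing them at random.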
