RedEx: Beyond Fixed Representation Methods via Convex Optimization

Amit Daniely; Mariano Schain; Gilad Yehudai

TL;DR

This work addresses the gap between the optimization guarantees of fixed representation methods and the expressive power of neural networks by introducing RedEx, a Reduced Extractor-Expander architecture. RedEx matches neural networks in expressiveness while allowing layerwise training via convex semidefinite programs, with provable guarantees and no assumptions on the input distribution. A key contribution is a separation result showing that RedEx can efficiently learn a family of functions that fixed representation methods cannot, along with an efficient training algorithm and a multi-layer extension. The norm-based formulation (related to the trace norm) and the convolutional extension further broaden applicability, suggesting a path toward scalable, provably learnable representation learning outside standard SDP frameworks.

Abstract

Optimizing neural networks is a difficult task that is still not well understood. On the other hand, fixed representation methods such as kernels and random features have provable optimization guarantees but inferior performance, due to their inherent inability to learn representations. In this paper, we aim to bridge this gap by presenting a novel architecture called RedEx (Reduced Expander Extractor) that is as expressive as neural networks and can also be trained in a layer-wise fashion via a convex program with semi-definite constraints and optimization guarantees. We also show that RedEx provably surpasses fixed representation methods, in the sense that it can efficiently learn a family of target functions that fixed representation methods cannot.

Paper Structure (23 sections, 13 theorems, 54 equations, 5 algorithms)

Introduction
Related Works
Fixed representation methods and NTK.
Limitations of fixed representation methods.
Provable optimization beyond fixed representations.
Notations and Settings
Reduced Extractor-Expanders (RedEx)
Efficient and Provable Learnability of RedEx
Layerwise RedEx surpasses Kernel Methods
On the proof of Theorem \ref{thm:main_layerwise}
Extensions and Discussion
Norm Formulation of RedEx and Relation to Trace norm
Convolutions
Conclusions and Future Work
Proofs from Section \ref{Sec:RedEx}
...and 8 more sections

Key Result

Theorem 3.2

Let $B:\{0,1\}^d\rightarrow\{0,1\}$ be a function computed by a Boolean circuit of size $T$. Then we can define a RedEx with depth $O(T)$ and intermediate feature dimension at most $O(T^2)$ that computes $B$.
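
The statement above does not spell out the construction, so the following is only an illustrative sketch of a standard fact that makes such a result plausible: on $\{0,1\}$ inputs, every basic Boolean gate is a polynomial of degree at most 2 in its input wires, so a circuit of size $T$ can be evaluated one gate at a time in $T$ steps. Whether the paper's proof proceeds this way is an assumption; the names `GATES`, `eval_circuit`, and `MAJORITY` are hypothetical and not taken from the paper.

```python
from itertools import product

# Each gate written as a polynomial of degree <= 2 (valid for inputs in {0, 1}).
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
    "NOT": lambda a: 1 - a,
}


def eval_circuit(circuit, x):
    """Evaluate a circuit given as a list of (gate_name, input_wire_indices).

    Wires 0..d-1 hold the input bits; the t-th gate writes wire d+t.
    The value of the last wire is the circuit's output.
    """
    wires = list(x)
    for name, args in circuit:
        wires.append(GATES[name](*(wires[i] for i in args)))
    return wires[-1]


# Hypothetical example circuit of size T = 4: majority of three bits,
# MAJ(x0, x1, x2) = (x0 AND x1) OR (x2 AND (x0 OR x1)).
MAJORITY = [
    ("AND", (0, 1)),  # wire 3
    ("OR", (0, 1)),   # wire 4
    ("AND", (2, 4)),  # wire 5
    ("OR", (3, 5)),   # wire 6 (output)
]

# Check the polynomial evaluation against the Boolean definition on all inputs.
for bits in product((0, 1), repeat=3):
    assert eval_circuit(MAJORITY, bits) == int(sum(bits) >= 2)
```

Each gate adds one step that depends only on earlier wire values, which loosely mirrors a depth that grows linearly in $T$. The sketch says nothing about how RedEx layers realize these steps or about the $O(T^2)$ feature dimension.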

Theorems & Definitions (26)

Definition 3.1: RedEx - Reduced Extractor-Expander
Theorem 3.2
Theorem 4.1
Remark 4.2
Remark 4.3
Theorem 4.4
Theorem 5.1
Theorem 5.2
Definition 6.1
Lemma 6.2
...and 16 more
