Random Linear Projections Loss for Hyperplane-Based Optimization in Neural Networks
Shyam Venkatasubramanian, Ahmed Aloui, Vahid Tarokh
TL;DR
This work introduces Random Linear Projections (RLP) loss, a hyperplane-based, non-local objective that minimizes the distance between regression hyperplanes fit to fixed-size subsets of feature-prediction pairs and the corresponding feature-label pairs. The authors prove that the RLP loss is minimized by the conditional expectation $h(x)=\mathbb{E}[Y \mid X=x]$ and show that, under suitable assumptions, it converges faster than mean squared error (MSE). They present a two-step algorithm: balanced batch generation followed by RLP-based training. Empirically, RLP improves performance across regression, image reconstruction, and classification tasks (e.g., California Housing, MNIST, CIFAR-10), demonstrating faster convergence, better generalization, and robustness to limited data, distribution shifts, and additive noise. The authors also explore mixup variants. A key caveat is the computational cost of the matrix inversions required at each training step, which motivates work on scalable optimization and further theoretical development of the statistical properties of RLP loss.
Abstract
Advancing loss function design is pivotal for optimizing neural network training and performance. This work introduces Random Linear Projections (RLP) loss, a novel approach that enhances training efficiency by leveraging geometric relationships within the data. Unlike traditional loss functions, which minimize pointwise errors, RLP loss minimizes the distance between sets of hyperplanes connecting fixed-size subsets of feature-prediction pairs and feature-label pairs. Our empirical evaluations, conducted across benchmark datasets and synthetic examples, demonstrate that neural networks trained with RLP loss outperform those trained with traditional loss functions, achieving improved performance with fewer data samples and exhibiting greater robustness to additive noise. We provide theoretical analysis supporting our empirical findings.
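To make the hyperplane comparison concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each hyperplane is the ordinary least-squares fit (without intercept) to a randomly drawn subset, and that the distance between two hyperplanes is the squared Euclidean distance between their coefficient vectors; the exact definition in the paper may differ. The function name `rlp_loss_sketch` and its parameters are hypothetical, and uniform random sampling stands in for the balanced batch generation step, which is not reproduced here.

```python
import numpy as np


def rlp_loss_sketch(X, y_true, y_pred, subset_size, num_subsets, rng):
    """Average squared distance between hyperplanes fit to labels and to predictions.

    Assumptions (not taken from the paper): hyperplanes are least-squares
    fits without an intercept, and subsets are sampled uniformly at random.
    """
    n = X.shape[0]
    total = 0.0
    for _ in range(num_subsets):
        # Draw a fixed-size subset of examples without replacement.
        idx = rng.choice(n, size=subset_size, replace=False)
        Xs = X[idx]
        # Coefficients of the hyperplane through the feature-label pairs.
        w_true, *_ = np.linalg.lstsq(Xs, y_true[idx], rcond=None)
        # Coefficients of the hyperplane through the feature-prediction pairs.
        w_pred, *_ = np.linalg.lstsq(Xs, y_pred[idx], rcond=None)
        # Squared Euclidean distance between the two coefficient vectors.
        total += float(np.sum((w_true - w_pred) ** 2))
    return total / num_subsets


# Usage: synthetic linear data with predictions from a shifted weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w = np.array([1.0, -2.0, 0.5])
delta = np.array([0.1, 0.0, -0.2])
y_true = X @ w
y_pred = X @ (w + delta)

loss_perfect = rlp_loss_sketch(X, y_true, y_true, 4, 10, rng)
loss_shifted = rlp_loss_sketch(X, y_true, y_pred, 4, 10, rng)
```

Under these assumptions the sketch returns zero when the predictions equal the labels, and it penalizes a mismatch in the fitted hyperplane rather than in individual points. `np.linalg.lstsq` is used in place of an explicit matrix inverse for numerical stability; the per-subset solve is still the step that drives the computational cost noted in the TL;DR.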
