Randomized Geometric Algebra Methods for Convex Neural Networks

Yifei Wang, Sungyoon Kim, Paul Chu, Indu Subramaniam, Mert Pilanci

TL;DR

The paper addresses training neural networks and transfer learning in hypercomplex spaces defined by Geometric/Clifford Algebra. It proposes randomized geometric algebra methods to efficiently compute generalized cross-products $\times(x_1,\dots,x_{d-1})$ and to recast two-layer ReLU networks $f^{\text{ReLU}}_{\theta}$ as convex optimization problems, aided by randomized embeddings and sketching. Key contributions include a GA-based convex reformulation, a randomized sampling algorithm for hyperplane arrangements, and empirical results showing faster training, improved training/test accuracy, and enhanced robustness on embedding-based NLP tasks using GPT-4 and BERT embeddings across IMDb, GLUE, and related datasets. The work highlights improved stability and potential for global optima in transfer learning with geometric algebra, with avenues for extending to LLM fine-tuning and GA-driven architectures.
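To make the generalized cross-product $\times(x_1,\dots,x_{d-1})$ concrete, here is a minimal NumPy sketch that computes it directly by cofactor expansion. This is the naive deterministic definition, not the paper's randomized method; the function name `generalized_cross_product` and the sign convention (basis vectors placed in the first row of the formal determinant) are assumptions made for illustration.

```python
import numpy as np

def generalized_cross_product(vectors):
    """Return a vector in R^d orthogonal to the d-1 given vectors.

    Component i is the signed cofactor (-1)**i * det(V with column i removed).
    Assumed convention: basis vectors sit in the first row of the formal
    determinant, which reproduces the usual cross product when d = 3.
    """
    V = np.asarray(vectors, dtype=float)
    n, d = V.shape
    if n != d - 1:
        raise ValueError("expected d-1 vectors of dimension d")
    out = np.empty(d)
    for i in range(d):
        minor = np.delete(V, i, axis=1)  # drop column i, leaving (d-1) x (d-1)
        out[i] = (-1) ** i * np.linalg.det(minor)
    return out

# d = 3: agrees with the classical cross product.
a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(np.allclose(generalized_cross_product([a, b]), np.cross(a, b)))  # True

# d = 5: the result is orthogonal to every input vector.
rng = np.random.default_rng(0)
V = rng.standard_normal((4, 5))
print(np.allclose(V @ generalized_cross_product(V), 0.0, atol=1e-9))  # True
```

Each of the $d$ determinants costs roughly $O(d^3)$, so this direct approach scales poorly with dimension, which is the kind of cost that randomized embeddings and sketching aim to reduce.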

Abstract

We introduce randomized algorithms to Clifford's Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. This novel approach has many implications in machine learning, including training neural networks to global optimality via convex optimization. Additionally, we consider fine-tuning large language model (LLM) embeddings as a key application area, exploring the intersection of geometric algebra and modern AI techniques. In particular, we conduct a comparative analysis of the robustness of transfer learning via embeddings, such as OpenAI GPT models and BERT, using traditional methods versus our novel approach based on convex optimization. We test our convex optimization transfer learning method across a variety of case studies, employing different embeddings (GPT-4 and BERT embeddings) and different text classification datasets (IMDb, Amazon Polarity Dataset, and GLUE) with a range of hyperparameter settings. Our results demonstrate that convex optimization and geometric algebra not only enhance the performance of LLMs but also offer a more stable and reliable method of transfer learning via embeddings.

Paper Structure

This paper contains 18 sections, 9 theorems, 44 equations, 17 figures, 4 algorithms.

Key Result

Lemma 3.2

The regularization path of the optimal solution to the non-convex training problem (train:noncvx), taken with respect to the regularization parameter $\beta>0$, can be computed by solving the convex Lasso problem (cvxnn:lasso).

Figures (17)

  • Figure 1: Decision regions from different variants of convex optimization based training. The triangles represent data points in the training set. The Convex Lasso method directly solves the convex Lasso problem (cvxnn:lasso). The Convex Lasso subsampled method subsamples $200$ rows from the dictionary matrix $K$ in (cvxnn:lasso) and solves the subsampled problem. The Geometric Algebra and Gaussian methods solve the convex optimization formulation (cvxnn:relu) with $200$ subsampled hyperplane arrangement patterns, drawn with geometric algebra and Gaussian samples respectively. See the video demonstration https://anonymous.4open.science/r/CVXNN-randomized-GA-D400/video/full.mp4.
  • Figure : IMDb
  • Figure : IMDb
  • Figure : GLUE-COLA
  • Figure : IMDb
  • ...and 12 more figures
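
The Figure 1 caption contrasts hyperplane arrangement patterns subsampled with Gaussian directions against patterns subsampled with geometric algebra. The sketch below illustrates one plausible reading of that contrast and is not taken from the paper: a pattern is the 0/1 vector $\mathbb{1}[Xh \ge 0]$, the Gaussian variant draws $h$ at random, and the second variant (an assumption) takes $h$ orthogonal to $d-1$ randomly chosen training points, which matches the generalized cross-product direction up to sign and scale. All function names are hypothetical, and the paper's actual algorithm may differ.

```python
import numpy as np

def gaussian_patterns(X, num_samples, rng):
    """Patterns 1[X h >= 0] for i.i.d. Gaussian directions h (one per column)."""
    d = X.shape[1]
    H = rng.standard_normal((d, num_samples))
    P = (X @ H >= 0).astype(np.uint8)
    return np.unique(P, axis=1)  # drop duplicate columns

def orthogonal_direction(V):
    """Unit vector orthogonal to the rows of V, shape (d-1, d), via the SVD.

    Parallel to the generalized cross product of the rows when they are
    linearly independent; the sign is whatever the SVD returns.
    """
    _, _, vt = np.linalg.svd(V)
    return vt[-1]

def data_driven_patterns(X, num_samples, rng):
    """Patterns from hyperplanes through d-1 random data points (assumed variant)."""
    n, d = X.shape
    cols = []
    for _ in range(num_samples):
        idx = rng.choice(n, size=d - 1, replace=False)
        h = orthogonal_direction(X[idx])
        # Entries for the chosen points are ~0, so their sign is set by round-off.
        cols.append((X @ h >= 0).astype(np.uint8))
    return np.unique(np.stack(cols, axis=1), axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))          # toy data: 50 points in R^3
P_gauss = gaussian_patterns(X, 200, rng)
P_data = data_driven_patterns(X, 200, rng)
print(P_gauss.shape, P_data.shape)        # (50, k) with k <= 200 distinct patterns
```

Each column of the returned matrix would define one diagonal mask in a convex reformulation of the two-layer ReLU network; the count of 200 samples mirrors the caption, while the toy data dimensions are arbitrary.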

Theorems & Definitions (15)

  • Definition 3.1
  • Lemma 3.2
  • Proposition 3.3
  • Proposition 3.4
  • Theorem 3.5
  • Theorem 3.6
  • proof
  • Proposition B.1
  • proof
  • Proposition B.2
  • ...and 5 more