Adaptable Embeddings Network (AEN)

Stan Loosmore, Alexander Titus

TL;DR

We introduce Adaptable Embeddings Networks (AEN), a novel non-autoregressive dual-encoder architecture that uses Kernel Density Estimation (KDE) and allows classification criteria to be adapted at runtime without retraining.

Abstract

Modern language models see extensive use in text classification, yet this comes at significant computational cost. Compute-efficient classification models are needed for low-resource environments, most notably edge devices. We introduce Adaptable Embeddings Networks (AEN), a novel dual-encoder architecture using Kernel Density Estimation (KDE). This architecture is non-autoregressive and allows classification criteria to be adapted at runtime without retraining. Through extensive experiments on synthetic data, we demonstrate that our model achieves results comparable, and in certain cases superior, to those of autoregressive models an order of magnitude larger than AEN. The architecture's ability to preprocess and cache condition embeddings makes it ideal for edge computing applications and real-time monitoring systems.

Paper Structure

This paper contains 58 sections, 7 equations, 4 figures, 11 tables.

Figures (4)

  • Figure 1: 1D Kernel Density Estimation showing individual kernels (dashed lines) and their sum (solid line)
  • Figure 2: Data Generation Process
  • Figure 3: AEN General training architecture
  • Figure 4: AEN runtime architecture for constant data evaluation
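
The idea behind Figure 1 can be illustrated with a minimal, generic sketch of 1D Kernel Density Estimation. This is not the authors' implementation: the Gaussian kernel, the bandwidth value, the sample points, the 1/n scaling of each kernel, and the function names are all assumptions chosen for illustration. It shows how individual kernels (the dashed lines) add up to the overall density estimate (the solid line).

```python
import math


def gaussian_kernel(x, center, bandwidth):
    """Gaussian density centred at `center` with standard deviation `bandwidth`."""
    z = (x - center) / bandwidth
    return math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2.0 * math.pi))


def kde_components(x, samples, bandwidth):
    """Per-sample kernel contributions at x, each scaled by 1/n (assumed scaling)."""
    n = len(samples)
    return [gaussian_kernel(x, s, bandwidth) / n for s in samples]


def kde(x, samples, bandwidth):
    """Density estimate at x: the sum of the scaled individual kernels."""
    return sum(kde_components(x, samples, bandwidth))


# Hypothetical 1D sample points and bandwidth, for illustration only.
samples = [-1.0, 0.0, 0.5, 2.0]
bandwidth = 0.5

# Evaluate the estimate on a grid wide enough to cover the kernels' tails.
step = 0.01
grid = [i * step for i in range(-600, 801)]
density = [kde(x, samples, bandwidth) for x in grid]

# A valid density integrates to approximately 1 (simple Riemann sum).
area = sum(density) * step
print(f"approximate area under the KDE curve: {area:.4f}")
```

A smaller bandwidth gives narrower kernels and a spikier estimate; a larger one smooths the sum. How AEN applies KDE to embeddings is described in the paper itself and is not reproduced here.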