Mapping Networks

Lord Sen; Shyamapada Mukherjee

TL;DR

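Mapping Networks replace a network's high-dimensional weight space with a compact, trainable latent vector. The Mapping Theorem, enforced by a dedicated Mapping Loss, establishes that a mapping from this latent space to the target weight space exists, both theoretically and in practice.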

Abstract

The escalating parameter counts of modern deep learning models pose a fundamental challenge to efficient training and to mitigating overfitting. We address this by introducing \emph{Mapping Networks}, which replace the high-dimensional weight space with a compact, trainable latent vector, based on the hypothesis that the trained parameters of large networks reside on smooth, low-dimensional manifolds. The Mapping Theorem, enforced by a dedicated Mapping Loss, establishes the existence of a mapping from this latent space to the target weight space, both theoretically and in practice. Mapping Networks significantly reduce overfitting and achieve performance comparable to or better than the target network across complex vision and sequence tasks, including image classification and deepfake detection, with a $\mathbf{99.5\%}$ (i.e., around $500\times$) reduction in trainable parameters.
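The core idea of generating a target network's weights from a small latent vector can be illustrated with a minimal sketch. This is not the authors' architecture: the frozen random linear map, the layer sizes, the latent dimension, and every name below (`FixedLinearMapping`, `unpack`, `forward`) are assumptions chosen only for illustration, and training, weight modulation, and the Mapping Loss are omitted.

```python
import random


def count_target_params(layer_sizes):
    """Number of weights and biases in a fully connected target network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


class FixedLinearMapping:
    """Hypothetical mapping theta = M z with a frozen random matrix M.

    Assumption for illustration only: the paper's mapping network is not
    specified here, so a fixed linear map stands in for it.
    """

    def __init__(self, latent_dim, n_target, seed=0):
        rng = random.Random(seed)
        scale = latent_dim ** -0.5
        self.rows = [[rng.gauss(0.0, scale) for _ in range(latent_dim)]
                     for _ in range(n_target)]

    def __call__(self, z):
        # One dot product per generated target parameter.
        return [sum(m * zi for m, zi in zip(row, z)) for row in self.rows]


def unpack(theta, layer_sizes):
    """Slice the flat parameter vector into per-layer (weights, biases)."""
    layers, offset = [], 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [theta[offset + r * n_in: offset + (r + 1) * n_in]
                   for r in range(n_out)]
        offset += n_in * n_out
        biases = theta[offset: offset + n_out]
        offset += n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Forward pass of the target network, ReLU on all but the last layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


layer_sizes = [8, 16, 4]        # toy target network (assumed sizes)
latent_dim = 4                  # the only quantity that would be trained
n_target = count_target_params(layer_sizes)

mapping = FixedLinearMapping(latent_dim, n_target)
z = [0.1] * latent_dim          # latent vector
theta = mapping(z)              # generated target weights
out = forward(unpack(theta, layer_sizes), [1.0] * layer_sizes[0])

reduction = n_target / latent_dim
print(f"target params: {n_target}, trainable: {latent_dim}, "
      f"reduction: {reduction:.0f}x")
```

In this toy setting only the entries of `z` would receive gradient updates, so the trainable parameter count equals the latent dimension regardless of how large the target network is.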

Paper Structure (26 sections, 30 equations, 5 figures, 8 tables)

Introduction
Methodology
Weight–Manifold Hypothesis
Mapping Theorem and Practical Corollary
Mapping Network
Trainable Latent Vector
Mapping Network with weight modulation
Mapping to Network's Parameters
Target Network for feedforward and Inference
Architecture Add-Ons
Extension to Fine Tuning
Mapping Loss
Training
Single Latent Vector Training (SLVT)
Layer wise Training (LWT)
...and 11 more sections

Figures (5)

Figure 1: State of the Existing Works and Ours in this field
Figure 2: Parameter update snapshots showing distinct parameter manifolds in CNN evolution.
Figure 3: General Architecture for Mapping Networks.
Figure 4: Process of modulation of Mapping weights and training of latent vector z from epoch p to p+1.
Figure 5: Training strategies used for Mapping Network
