Implicit Bias in Deep Linear Discriminant Analysis

Jiawen Li

TL;DR

By analyzing the gradient flow of the loss on an L-layer diagonal linear network, the paper proves that, under balanced initialization, the network architecture transforms standard additive gradient updates into multiplicative weight updates, and shows that the (2/L) quasi-norm is automatically conserved.

Abstract

While the implicit bias (or implicit regularization) of standard loss functions has been studied, the optimization geometry induced by discriminative metric-learning objectives remains largely unexplored. To the best of our knowledge, this paper presents an initial theoretical analysis of the implicit regularization induced by Deep LDA, a scale-invariant objective designed to minimize intraclass variance and maximize interclass distance. By analyzing the gradient flow of the loss on an L-layer diagonal linear network, we prove that, under balanced initialization, the network architecture transforms standard additive gradient updates into multiplicative weight updates, and we show that the (2/L) quasi-norm is automatically conserved.
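The conservation claim can be checked numerically on a toy problem. The sketch below is not the paper's experiment: it assumes a Rayleigh-quotient stand-in for the Deep LDA loss, J(beta) = (beta' S_w beta) / (beta' S_b beta), with synthetic positive-definite matrices S_w and S_b, and it parameterizes beta as the elementwise product of L weight vectors. The names `rayleigh_loss_and_grad` and `simulate`, the step size, and the matrices are all illustrative assumptions. Forward Euler only approximates gradient flow, so the quantity sum_i |beta_i|^(2/L) is expected to stay nearly, not exactly, constant.

```python
import numpy as np

def rayleigh_loss_and_grad(beta, S_w, S_b):
    """Scale-invariant ratio J = (b' S_w b) / (b' S_b b) and its gradient in b.

    Assumed stand-in for the Deep LDA objective, not the paper's exact loss.
    """
    num = beta @ S_w @ beta
    den = beta @ S_b @ beta
    loss = num / den
    grad = (2.0 / den) * (S_w @ beta - loss * (S_b @ beta))
    return loss, grad

def simulate(L=3, d=5, steps=20000, lr=1e-4, seed=0):
    rng = np.random.default_rng(seed)

    def toy_scatter():
        # Synthetic symmetric matrix with eigenvalues in [1, 2] (assumption).
        A = rng.standard_normal((d, d))
        M = A @ A.T
        return np.eye(d) + M / np.linalg.norm(M, 2)

    S_w, S_b = toy_scatter(), toy_scatter()

    # Balanced initialization: every layer starts from the same vector.
    w0 = np.linspace(0.8, 1.2, d)
    W = np.tile(w0, (L, 1))

    def quasi_norm(W):
        beta = np.prod(W, axis=0)
        return np.sum(np.abs(beta) ** (2.0 / L))

    q_start = quasi_norm(W)
    loss_start, _ = rayleigh_loss_and_grad(np.prod(W, axis=0), S_w, S_b)

    for _ in range(steps):
        beta = np.prod(W, axis=0)
        _, g = rayleigh_loss_and_grad(beta, S_w, S_b)
        # Chain rule: dJ/dw_l = g * (product of all other layers).
        grads = np.stack(
            [g * np.prod(np.delete(W, l, axis=0), axis=0) for l in range(L)]
        )
        W = W - lr * grads  # forward-Euler step of gradient flow

    loss_end, _ = rayleigh_loss_and_grad(np.prod(W, axis=0), S_w, S_b)
    return q_start, quasi_norm(W), loss_start, loss_end

if __name__ == "__main__":
    q0, q1, j0, j1 = simulate()
    print(f"quasi-norm: {q0:.6f} -> {q1:.6f}")
    print(f"loss:       {j0:.6f} -> {j1:.6f}")
```

The reason this works in the sketch is that scale invariance makes the gradient orthogonal to beta, so each layer's update is orthogonal to that layer's weights. Under balance, sum_i |beta_i|^(2/L) equals the squared norm of any single layer, which therefore changes only at second order in the step size.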

Paper Structure (11 sections, 29 equations, 2 figures)

Introduction
Literature Review
Methodology
Diagonal Linear Networks (DLNs)
Conservation Lemma for DLNs
Implicit Bias in Deep LDA
Experiment
Conclusion
Proof of the Orthogonality Lemma
Gradient of the Rayleigh Quotient
Scale Invariance Property

Figures (2)

Figure 1: The Optimization in Simplex and the Implicit Bias
Figure 2: The Simulated Result for DeepLDA in DLNs.
