ESS-ReduNet: Enhancing Subspace Separability of ReduNet via Dynamic Expansion with Bayesian Inference

Xiaojie Yu, Haibo Zhang, Lizhi Peng, Fengyang Sun, Jeremiah Deng

TL;DR

ESS-ReduNet enhances the separability of each category's subspace by dynamically controlling the expansion of the overall space spanned by the samples, and it incorporates label knowledge through Bayesian inference to encourage the subspaces to decouple.

Abstract

ReduNet is a deep neural network model that leverages the principle of maximal coding rate reduction to transform original data samples into a low-dimensional, linearly discriminative feature representation. Unlike traditional deep learning frameworks, ReduNet constructs its parameters explicitly layer by layer, with each layer's parameters derived from the features transformed by the preceding layer. Rather than using labels directly, ReduNet uses the similarity between each category's spanned subspace and the data samples for feature updates at each layer. This may lead to features being updated in the wrong direction, impairing the correct construction of network parameters and reducing the network's convergence speed. To address this issue, based on the geometric interpretation of the network parameters, this paper presents ESS-ReduNet to enhance the separability of each category's subspace by dynamically controlling the expansion of the overall spanned space of the samples. Meanwhile, label knowledge is incorporated with Bayesian inference to encourage the decoupling of subspaces. Finally, stability, as assessed by the condition number, serves as an auxiliary criterion for halting training. Experiments on the ESR, HAR, Covertype, and Gas datasets demonstrate that ESS-ReduNet achieves more than 10x improvement in convergence compared to ReduNet. Notably, on the ESR dataset, the features transformed by ESS-ReduNet achieve a 47% improvement in SVM classification accuracy.
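
To make the coding rate reduction principle mentioned in the abstract concrete, the following is a minimal sketch of the objective as it is commonly stated in the maximal coding rate reduction literature: the coding rate of all features minus the size-weighted coding rates of each class. It is not taken from this paper. The function names, the distortion parameter `eps`, its default value, and the toy data are assumptions chosen for illustration, and the sketch does not implement ReduNet's layer construction or any ESS-ReduNet component.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps) = 1/2 * logdet(I + d / (m * eps^2) * Z Z^T), with Z of shape (d, m)."""
    d, m = Z.shape
    alpha = d / (m * eps ** 2)
    # slogdet is numerically safer than log(det(...)) for larger matrices.
    _, logdet = np.linalg.slogdet(np.eye(d) + alpha * Z @ Z.T)
    return 0.5 * logdet

def coding_rate_reduction(Z, labels, eps=0.5):
    """Delta R = R(Z) - sum_j (m_j / m) * R(Z_j), where Z_j holds the columns of class j."""
    _, m = Z.shape
    per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        per_class += (Zc.shape[1] / m) * coding_rate(Zc, eps)
    return coding_rate(Z, eps) - per_class

# Toy data (hypothetical): two classes with two samples each, in d = 2 dimensions.
labels = np.array([0, 0, 1, 1])

# Classes lying on orthogonal axes: the class subspaces are well separated.
Z_separated = np.array([[1.0, 1.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0, 1.0]])

# Both classes lying on the same axis: the class subspaces coincide.
Z_overlapping = np.array([[1.0, 1.0, 1.0, 1.0],
                          [0.0, 0.0, 0.0, 0.0]])

delta_separated = coding_rate_reduction(Z_separated, labels)
delta_overlapping = coding_rate_reduction(Z_overlapping, labels)
print(delta_separated, delta_overlapping)
```

In this toy case the separated arrangement yields a strictly larger value than the overlapping one, which is the sense in which a larger coding rate reduction corresponds to more separable class subspaces.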

Paper Structure

This paper contains 28 sections, 19 equations, 34 figures, 1 table, 2 algorithms.

Figures (34)

  • Figure 1: The Vicious Cycle of ReduNet. $S_0$ and $S_1$ are the spanned subspaces of class $0$ and class $1$, respectively. $z \rightarrow S_0$ means that sample $z$ is updated towards $S_0$.
  • Figure 2: A Layer of ReduNet
  • Figure 3: (a) The number of misclassified labels of $\{\hat{\boldsymbol{\pi}}^j\}_{j=1}^{k}$; (b) Objective function curve; (c) Rank trend; (d) Condition number trend.
  • Figure 4: Overview of ESS-ReduNet
  • Figure 5: The Geometric Interpretation of Least Squares
  • ...and 29 more figures