Associative memory and dead neurons

Vladimir Fanaskov, Ivan Oseledets

TL;DR

The paper shows that the Lyapunov energy used in Hopfield-style associative memory develops non-compact flat regions wherever neurons die, so the energy alone cannot be used for stability analysis. It shows that stability can instead be analyzed in the range of the Hessian matrix of the Lagrange function, and it introduces a slightly altered dynamical system, with the same structure of steady states, whose Lyapunov functions have no flat regions caused by dead neurons. The key contributions are a formal statement of the dead-neuron flat-energy phenomenon, a stability analysis restricted to the range of the Hessian, and a family of energy functions that also admit Lagrange functions with Hessians that are not positive definite and architectures with non-symmetric feedforward and feedback connections. These results give a more robust energy-based framework for associative memory.

Abstract

In "Large Associative Memory Problem in Neurobiology and Machine Learning," Dmitry Krotov and John Hopfield introduced a general technique for the systematic construction of neural ordinary differential equations with a non-increasing energy, or Lyapunov function. We study this energy function and find that it is vulnerable to the problem of dead neurons. Each point in the state space where a neuron dies is contained in a non-compact region of constant energy. In these flat regions, the energy function alone does not determine all degrees of freedom and, as a consequence, cannot be used to analyze stability or to find steady states or basins of attraction. We perform a direct analysis of the dynamical system and show how to resolve the problems caused by flat directions corresponding to dead neurons: (i) all information about the state vector at a fixed point can be extracted from the energy and the Hessian matrix (of the Lagrange function), (ii) it is enough to analyze stability in the range of the Hessian matrix, (iii) if a steady state touching a flat region is stable, the whole flat region is its basin of attraction. The analysis of the Hessian matrix can be complicated for realistic architectures, so we show that for a slightly altered dynamical system (with the same structure of steady states), one can derive a diverse family of Lyapunov functions that do not have flat regions corresponding to dead neurons. In addition, these energy functions allow one to use Lagrange functions with Hessian matrices that are not necessarily positive definite and even to consider architectures with non-symmetric feedforward and feedback connections.
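The flat regions described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code. It assumes the Krotov-Hopfield energy in the form $E(x) = (x - I)^{\top} g(x) - L(x) - \tfrac{1}{2}\, g(x)^{\top} W g(x)$ with a ReLU activation $g = \nabla L$ and Lagrange function $L(x) = \tfrac{1}{2}\sum_i \mathrm{relu}(x_i)^2$. The symbols `x`, `W`, `I` and all numeric values are illustrative assumptions and may differ from the paper's notation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def lagrangian(x):
    # Assumed Lagrange function L(x) = 1/2 * sum(relu(x_i)^2); its gradient is relu(x).
    return 0.5 * np.sum(relu(x) ** 2)

def energy(x, W, I):
    # Assumed energy form: E = (x - I)^T g - L(x) - 1/2 g^T W g, with g = relu(x).
    g = relu(x)
    return (x - I) @ g - lagrangian(x) - 0.5 * g @ W @ g

# Illustrative values (assumptions): symmetric weights W and bias I.
W = np.array([[0.0, 0.5, -0.2],
              [0.5, 0.0, 0.3],
              [-0.2, 0.3, 0.0]])
I = np.array([0.1, -0.4, 0.2])

# Neuron 1 is dead at this state: x[1] < 0, so relu(x[1]) = 0.
x = np.array([0.7, -0.3, 1.2])
V = np.eye(3)[:, [1]]  # basis of the dead direction

# Move along the dead direction while the neuron stays dead (x[1] + c < 0).
# The shift can be arbitrarily negative, so the flat region is non-compact.
shifts = [0.0, 0.2, -1.0, -100.0, -1e6]
flat_energies = np.array([energy(x + V @ np.array([c]), W, I) for c in shifts])

# For comparison, move a live neuron: the energy changes.
x_live = x + np.array([0.5, 0.0, 0.0])

print("energies along dead direction:", flat_energies)
print("energy after moving a live neuron:", energy(x_live, W, I))
```

Under these assumptions, every entry of `flat_energies` is the same number, because a dead neuron contributes nothing to any of the three terms of the energy. The energy therefore carries no information about the coordinate `x[1]` in this region, which is why a separate stability analysis is needed there.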

Paper Structure

This paper contains 13 sections, 5 theorems, 44 equations, 2 figures.

Key Result

Proposition 1

If at a given point $\boldsymbol{y}$ the activation function $\boldsymbol{g}$ has $k$ dead neurons defined by $\boldsymbol{V}\in\mathbb{R}^{N\times k}$, the energy function (eq:energy function) has a constant value in the subspace $\mathcal{D} = \left\{\boldsymbol{y} + \boldsymbol{V}{\boldsymbol{c}}:\boldsymbol{

Figures (2)

  • Figure 1: Vector fields of dynamical systems (top row) and level sets of energy functions (bottom row) for energy functions (\ref{eq:energy function}) (the one from krotov2020large) and (\ref{eq:E3 for original system}) (proposed in this article). The vector fields suggest that all steady states are stable, but energy function (\ref{eq:energy function}) does not indicate that. The reason is that energy function (\ref{eq:energy function}) has non-compact flat regions touching each point where one or more neurons die (see Proposition \ref{prop:flat energy} for the precise statement).
  • Figure 2: Sketch of three problematic energy functions: (a) unbounded from below with two stable states (we analyze this situation in Appendix \ref{appendix:bounded from below}), (b) bounded from below but with a compact flat region, (c) bounded from below but with a non-compact flat region. According to Proposition \ref{prop:flat energy}, case (c) is realized in the models of krotov2020large when neurons die. In the red regions, stability properties do not follow from Lyapunov theorems. LaSalle's invariance principle (haddad2008nonlinear) ensures that in case (b) isolated steady states are stable, but in case (c) a separate stability analysis is needed (see Section \ref{subsection:stability} for details).

Theorems & Definitions (13)

  • Example 1: MLP with feedback connections
  • Example 2: flat energy with ReLU activations
  • Example 3: flat energy with sigmoid activations
  • Example 4: flat energy with softmax activations
  • Definition: dead neurons
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • ...and 3 more