On De-Individuated Neurons: Continuous Symmetries Enable Dynamic Topologies

George Bird

TL;DR

This paper introduces a novel methodology for dynamic networks by leveraging a new symmetry-principled class of primitives, isotropic activation functions, which enables real-time neuronal growth and shrinkage of the architectures in response to task demand.

Abstract

This paper introduces a novel methodology for dynamic networks by leveraging a new symmetry-principled class of primitives, isotropic activation functions. This approach enables real-time neuronal growth and shrinkage of the architectures in response to task demand. This is made possible by network structural changes that are invariant under symmetry reparameterisations, leaving the computation identical under neurogenesis and well approximated under neurodegeneration. This is undertaken by leveraging the isotropic primitives' property of basis independence, resulting in the loss of the individuated neurons implicit in the elementwise functional form. Isotropy thereby allows a freedom in the basis to which layers are decomposed and interpreted as individual artificial neurons. This enables a layer-wise diagonalisation procedure, in which typical interconnected layers, such as dense layers, convolutional kernels, and others, can be reexpressed so that neurons have one-to-one, ordered connectivity within alternating layers. This indicates which one-to-one neuron-to-neuron communications are strongly impactful on overall functionality and which are not. Inconsequential neurons can thus be removed (neurodegeneration), and new inactive scaffold neurons added (neurogenesis) whilst remaining analytically invariant in function. A new tunable model parameter, intrinsic length, is also introduced to ensure this analytical invariance. This approach mathematically equates connectivity pruning with neurodegeneration. The diagonalisation also offers new possibilities for mechanistic interpretability into isotropic networks, and it is demonstrated that isotropic dense networks can asymptotically reach a sparsity factor of 50% whilst retaining exact network functionality. Finally, the construction is generalised, demonstrating a nested functional class for this form of isotropic primitive architectures.
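The basis independence described in the abstract can be illustrated with a small numerical sketch. It assumes a radial activation of the form f(x) = tanh(|x|) x/|x|, which is one possible isotropic function and is not confirmed by the text above as the paper's choice. The names `isotropic_tanh`, `forward` and `diagonalise_middle` are hypothetical, and the paper's intrinsic length parameter is not modelled. Because such an f commutes with orthogonal matrices, the singular value decomposition of a hidden layer's weights can be absorbed into the neighbouring layers, leaving a diagonal (one-to-one) layer without changing the output.

```python
import numpy as np

def isotropic_tanh(x, eps=1e-12):
    """Assumed isotropic form: rescale x radially by tanh(|x|) / |x|."""
    norm = np.linalg.norm(x)
    return x * (np.tanh(norm) / (norm + eps))

def forward(params, x):
    """Affine -> isotropic -> affine -> isotropic -> affine."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = isotropic_tanh(W1 @ x + b1)
    h2 = isotropic_tanh(W2 @ h1 + b2)
    return W3 @ h2 + b3

def diagonalise_middle(params):
    """Double-sided basis change that turns W2 into its singular values."""
    (W1, b1), (W2, b2), (W3, b3) = params
    U, s, Vt = np.linalg.svd(W2)       # W2 = U @ S @ Vt, U and Vt orthogonal
    S = np.zeros_like(W2)
    idx = np.arange(len(s))
    S[idx, idx] = s                    # rectangular diagonal matrix
    return [(Vt @ W1, Vt @ b1),        # previous layer absorbs Vt
            (S, U.T @ b2),             # chosen layer is now one-to-one
            (W3 @ U, b3)]              # next layer absorbs U

rng = np.random.default_rng(0)
widths = [5, 4, 6, 3]                  # input, hidden, hidden, output (arbitrary)
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(widths[:-1], widths[1:])]
x = rng.normal(size=widths[0])

diag_params = diagonalise_middle(params)
max_diff = np.max(np.abs(forward(params, x) - forward(diag_params, x)))
print("max output difference:", max_diff)
```

In this sketch the two networks agree to floating-point precision, and the diagonal entries of the transformed middle layer are the singular values in descending order, which gives one plausible reading of the "ordered" one-to-one connectivity.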

Paper Structure

This paper contains 24 sections, 56 equations, 5 figures.

Figures (5)

  • Figure 1: This illustration depicts the qualitative effects of full diagonalisation on a network: the chosen layer undergoes a double-sided basis change to its connectivity, reducing the map to a one-to-one correspondence between neurons. This drastically simplifies the interrelations within the chosen layer, allowing the dynamic network implementation to be applied. In general, for sequential layers, only one layer can be diagonalised at a time, since diagonalising one layer often destroys the diagonalised state of the immediately preceding and following layers. Layers that are interspaced by other affine transforms can be diagonalised concurrently.
  • Figure 2: This illustration depicts the qualitative effects on a network from partial diagonalisation. This time, the chosen layer has a single-sided basis change to the connectivity. Depending on the transform side, left- or right-sided, an initial or later mixing of connectivities occurs, followed by scaling by the singular values. This setup may be more convenient to implement than a full diagonalisation. Some interpretability and explanatory convenience are lost due to the remaining connectivity mixing.
  • Figure 3: All plots show accuracy on CIFAR10 classification for pretrained multilayer perceptron networks, which go through an additional $48$ epochs of training as their architecture is dynamically adapted. The top-left plot shows networks with an initial width of $8$ hidden-layer neurons, going through neurogenesis to layers of width 8, 16, 24 or 32. Similarly, the top-right plot shows results beginning at $16$ neurons, the bottom-left at $24$ neurons and the bottom-right at $32$. The larger networks undergo neurogenesis, no change, or neurodegeneration depending on whether the width increases, remains the same or decreases, respectively.
  • Figure 4: All plots show accuracy on CIFAR10 classification for pretrained multilayer perceptron networks, which go through an additional $48$ epochs of training as their architecture is dynamically adapted. The top-left plot shows networks with a final width of $8$ hidden-layer neurons, going through neurodegeneration from layers of width 8, 16, 24 or 32. Similarly, the top-right plot shows results finishing with $16$ neurons, the bottom-left with $24$ neurons and the bottom-right with $32$.
  • Figure 5: Displays plots identical to Fig. \ref{Fig:ResultsOne}, in the same layout, but also shows the $24$-epoch pretraining and the anisotropic tanh control network.
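
The single-sided basis change described for Figure 2 can be sketched in the same spirit. As an assumption, the activation is again taken to be the radial function f(x) = tanh(|x|) x/|x|, and the helper names `left_diagonalise` and `right_diagonalise` are hypothetical rather than taken from the paper. With W = U S Vᵀ, a left-sided change leaves the chosen layer as S Vᵀ (mixing, then scaling by the singular values), whereas a right-sided change leaves it as U S (scaling, then mixing). In both cases only one neighbouring layer needs to be modified.

```python
import numpy as np

def isotropic_tanh(x, eps=1e-12):
    """Assumed isotropic form: rescale x radially by tanh(|x|) / |x|."""
    norm = np.linalg.norm(x)
    return x * (np.tanh(norm) / (norm + eps))

def forward(params, x):
    """Affine -> isotropic -> affine -> isotropic -> affine."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = isotropic_tanh(W1 @ x + b1)
    h2 = isotropic_tanh(W2 @ h1 + b2)
    return W3 @ h2 + b3

def left_diagonalise(params):
    """Change only the output basis of W2; the following layer absorbs U."""
    (W1, b1), (W2, b2), (W3, b3) = params
    U, _, _ = np.linalg.svd(W2)
    return [(W1, b1), (U.T @ W2, U.T @ b2), (W3 @ U, b3)]

def right_diagonalise(params):
    """Change only the input basis of W2; the preceding layer absorbs Vt."""
    (W1, b1), (W2, b2), (W3, b3) = params
    _, _, Vt = np.linalg.svd(W2)
    return [(Vt @ W1, Vt @ b1), (W2 @ Vt.T, b2), (W3, b3)]

rng = np.random.default_rng(1)
widths = [5, 4, 6, 3]                  # input, hidden, hidden, output (arbitrary)
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(widths[:-1], widths[1:])]
x = rng.normal(size=widths[0])

left_params = left_diagonalise(params)
right_params = right_diagonalise(params)
y = forward(params, x)
print("left-sided difference: ", np.max(np.abs(y - forward(left_params, x))))
print("right-sided difference:", np.max(np.abs(y - forward(right_params, x))))
```

Under these assumptions both variants reproduce the original output to floating-point precision. The remaining orthogonal factor in the chosen layer is the "connectivity mixing" that the full, double-sided procedure would remove.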