Minimum number of neurons in fully connected layers of a given neural network (the first approximation)
Oleg I. Berngardt
TL;DR
The paper tackles the problem of determining the minimal width of fully connected layers for a neural network solving a given task, without performing multiple full trainings for different widths. It introduces a method that starts from a wide, cross-validated network and uses a truncated SVD autoencoder inserted after each studied layer to probe the layer's latent dimension, with the minimum width $M$ tied to the rank of the layer's output matrix $Y^{(n)}$ and validated via statistical equivalence. Experiments on MNIST variants and other datasets show that the identified minima can closely match original performance while being substantially smaller than universal bounds, though the approach is stochastic and does not guarantee trainability. Overall, the work provides a first approximation for per-layer neuron count that behaves as an intrinsic property of the solution, offering a potential pathway for lightweight architecture optimization and compression.
Abstract
This paper presents an algorithm for finding the minimum number of neurons in the fully connected layers of an arbitrary network solving a given problem, without training the network multiple times with different numbers of neurons. The algorithm is based on training an initial wide network using cross-validation over at least two folds. Then, using a truncated singular value decomposition autoencoder inserted after the studied layer of the trained network, we search for the minimum number of neurons with the network in inference-only mode. It is shown that the minimum number of neurons in a fully connected layer can be interpreted not as a network hyperparameter tied to the other hyperparameters of the network, but as an internal (latent) property of the solution, determined by the network architecture, the training dataset, the layer position, and the quality metric used. The minimum number of neurons can therefore be estimated for each hidden fully connected layer independently. The proposed algorithm is a first approximation for estimating the minimum number of neurons in a layer: on the one hand, it does not guarantee that a neural network with the found number of neurons can be trained to the required quality; on the other hand, it searches for the minimum number of neurons only within a limited class of possible solutions. The method was tested on several datasets in classification and regression problems.
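The inference-only search described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The synthetic layer outputs, the fixed linear `head` standing in for the rest of the network, the uncentered SVD, and the simple tolerance `tol` (used here in place of the statistical equivalence test across folds) are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def fit_svd_basis(Y):
    """Return right singular vectors of layer outputs Y (samples x neurons)."""
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return Vt

def svd_autoencode(Y, Vt, k):
    """Truncated SVD autoencoder: encode to k components, decode back."""
    Vk = Vt[:k].T                      # (neurons, k)
    return (Y @ Vk) @ Vk.T             # same shape as Y, rank at most k

def min_width(Y_fit, Y_eval, head, metric, tol):
    """Smallest k whose metric stays within tol of the unmodified network.

    Assumption: a fixed tolerance replaces the paper's statistical test.
    """
    Vt = fit_svd_basis(Y_fit)
    baseline = metric(head(Y_eval))
    for k in range(1, Vt.shape[0] + 1):
        score = metric(head(svd_autoencode(Y_eval, Vt, k)))
        if baseline - score <= tol:
            return k
    return Vt.shape[0]

# Toy setup (assumed): a 64-neuron layer whose outputs have rank 5.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 5))
mixing = rng.normal(size=(5, 64))
Y = latent @ mixing                    # layer outputs, samples x neurons

W_head = rng.normal(size=(64, 3))      # stand-in for the remaining layers
head = lambda acts: acts @ W_head
labels = np.argmax(head(Y), axis=1)    # labels the wide network gets right

Y_fit, Y_eval = Y[:200], Y[200:]       # two folds: fit the basis, evaluate
labels_eval = labels[200:]
accuracy = lambda logits: np.mean(np.argmax(logits, axis=1) == labels_eval)

k_min = min_width(Y_fit, Y_eval, head, accuracy, tol=0.01)
print("rank of Y:", np.linalg.matrix_rank(Y), "estimated minimum width:", k_min)
```

In this toy case the estimate cannot exceed the rank of the layer-output matrix, because projecting onto that many singular vectors reproduces the outputs up to floating-point error. It may come out lower when the metric tolerates some loss of information.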
