Optimal feature rescaling in machine learning based on neural networks
Federico Maria Vitrò, Marco Leonesio, Lorenzo Fagiano
TL;DR
The paper aims to improve the training efficiency and generalization of Feed Forward Neural Networks (FFNNs) by optimizing the scales of the input features. It introduces Optimal Feature Rescaling (OFR), in which a Genetic Algorithm (GA) searches for the scaling vector $s$ that preconditions the inputs and, in effect, modulates the first-layer weights, while the network itself is trained to minimize the mean absolute error (MAE) on the scaled data. The outer objective selects the best scaling by root mean square error (RMSE) on a validation set; because each candidate $s$ amounts to a different starting point for training, the search mitigates the non-convexity of the training problem and favors convergence toward a global minimum. On a centerless grinding regression task, OFR outperforms standardization at the cost of longer training time, which makes it a practical, if computationally expensive, preconditioning technique for industrial NN applications.
Abstract
This paper proposes a novel approach to improve the training efficiency and the generalization performance of Feed Forward Neural Networks (FFNNs) by means of an optimal feature rescaling (OFR) of the inputs, carried out by a Genetic Algorithm (GA). The OFR reshapes the input space, improving the conditioning of the gradient-based algorithm used for training. Moreover, the exploration of scale factors entailed by GA trials and selection corresponds to a different initialization of the first-layer weights at each training attempt, thus realizing a multi-start global search (albeit restricted to a few weights only) that fosters the achievement of a global minimum. The approach has been tested on an FFNN modeling the outcome of a real industrial process (centerless grinding).
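
The correspondence between input rescaling and first-layer weights can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the layer sizes, the random weights, and the scale factors in `s` are arbitrary assumptions chosen for illustration. It shows that feeding the rescaled input $s \odot x$ to a first layer with weights $W$ gives the same pre-activation as feeding the raw input $x$ to a layer whose weight columns have been multiplied by $s$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and values, chosen only for illustration.
n_features, n_hidden = 4, 3
W = rng.normal(size=(n_hidden, n_features))  # first-layer weights
b = rng.normal(size=n_hidden)                # first-layer biases
x = rng.normal(size=n_features)              # one raw input sample
s = np.array([0.5, 2.0, 10.0, 0.1])          # candidate scale factors

# Pre-activation when the network sees the rescaled input s * x.
z_scaled_input = W @ (s * x) + b

# Pre-activation when the raw input x meets rescaled weights.
# Broadcasting multiplies column j of W by s[j], i.e. W @ diag(s).
W_eff = W * s
z_scaled_weights = W_eff @ x + b

equivalent = np.allclose(z_scaled_input, z_scaled_weights)
print("same pre-activation:", equivalent)
```

Under these assumptions, each candidate $s$ proposed by the GA behaves like a different effective starting value $W \,\mathrm{diag}(s)$ of the first-layer weights, which is the multi-start interpretation given in the abstract. The gradient of the loss with respect to column $j$ of $W$ is likewise multiplied by $s_j$, which is how the rescaling also changes the conditioning seen by the gradient-based optimizer.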
