Optimal feature rescaling in machine learning based on neural networks

Federico Maria Vitrò; Marco Leonesio; Lorenzo Fagiano

TL;DR

The paper aims to improve the training efficiency and generalization of Feed Forward Neural Networks (FFNNs) by optimizing the scales of the input features. It introduces Optimal Feature Rescaling (OFR), in which a Genetic Algorithm (GA) searches for the scaling vector $s$ that preconditions the inputs and, in effect, rescales the first-layer weights, while the network itself is trained to minimize the $MAE$ on the scaled data. The outer objective is the $RMSE$ on a validation set, which selects the best scaling; exploring many scalings acts as a multi-start search that mitigates the non-convexity of training and favors reaching a better minimum. On a centerless grinding regression task, OFR outperforms standardization at the cost of longer training time, which makes it a practical, if computationally expensive, preconditioning technique for industrial NN applications.
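The two-level scheme summarized above (an outer search over $s$ scored by validation $RMSE$, an inner training run that minimizes $MAE$ on the scaled data) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: a linear model trained by subgradient descent stands in for the FFNN, the data are synthetic, and the GA operators (tournament selection, uniform crossover, Gaussian mutation, a $\log_{10}$ encoding of the scales) as well as every function name and parameter are hypothetical choices.

```python
import math
import random

def train_linear_mae(X, y, steps=100, lr=0.05):
    """Inner problem (assumed stand-in for the FFNN): full-batch subgradient
    descent on MAE for y ~ w.x + b. Zero initialization keeps it deterministic."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            sign = (err > 0) - (err < 0)          # subgradient of |err|
            for j in range(d):
                gw[j] += sign * xi[j] / n
            gb += sign / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def rmse(w, b, X, y):
    sq = [(sum(wj * xj for wj, xj in zip(w, xi)) + b - yi) ** 2
          for xi, yi in zip(X, y)]
    return math.sqrt(sum(sq) / len(sq))

def scale(X, s):
    """Multiply every feature column j by its scale factor s[j]."""
    return [[sj * xj for sj, xj in zip(s, xi)] for xi in X]

def fitness(log_s, train, val):
    """Outer objective: validation RMSE of a model trained on rescaled inputs."""
    s = [10.0 ** g for g in log_s]                # assumed log10 encoding
    w, b = train_linear_mae(scale(train[0], s), train[1])
    return rmse(w, b, scale(val[0], s), val[1])

def genetic_search(train, val, d, pop_size=10, generations=8, seed=0):
    rng = random.Random(seed)
    # The all-zero genome (s = 1, i.e. no rescaling) is part of the first population.
    pop = [[0.0] * d] + [[rng.uniform(-2, 2) for _ in range(d)]
                         for _ in range(pop_size - 1)]
    scored = sorted((fitness(g, train, val), g) for g in pop)
    for _ in range(generations):
        children = [scored[0][1]]                 # elitism: keep the best genome
        while len(children) < pop_size:
            pa = min(rng.sample(scored, 3))[1]    # tournament selection
            pb = min(rng.sample(scored, 3))[1]
            # Uniform crossover followed by Gaussian mutation.
            children.append([rng.choice(pair) + rng.gauss(0.0, 0.2)
                             for pair in zip(pa, pb)])
        scored = sorted((fitness(g, train, val), g) for g in children)
    return scored[0]

# Synthetic data whose features differ by orders of magnitude (assumption).
data_rng = random.Random(1)
def make(n):
    X = [[data_rng.uniform(-1, 1), data_rng.uniform(-100, 100),
          data_rng.uniform(-0.01, 0.01)] for _ in range(n)]
    y = [2.0 * a + 0.03 * b + 150.0 * c for a, b, c in X]
    return X, y

train, val = make(40), make(20)
baseline_rmse = fitness([0.0, 0.0, 0.0], train, val)   # no rescaling
best_rmse, best_log_s = genetic_search(train, val, d=3)
```

Because the unscaled genome is seeded into the first population and the best genome is always carried over, `best_rmse` in this sketch can never exceed `baseline_rmse`. Each fitness evaluation costs a full training run, which is the source of the extra training time mentioned above.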

Abstract

This paper proposes a novel approach to improve the training efficiency and the generalization performance of Feed Forward Neural Networks (FFNNs) by means of an optimal rescaling of the input features (OFR) carried out by a Genetic Algorithm (GA). The OFR reshapes the input space, improving the conditioning of the gradient-based algorithm used for training. Moreover, the exploration of scale factors entailed by GA trials and selection corresponds to a different initialization of the first-layer weights at each training attempt, thus realizing a multi-start global search algorithm (even though restricted to a few weights only) that fosters the achievement of a global minimum. The approach has been tested on an FFNN modeling the outcome of a real industrial process (centerless grinding).
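The correspondence between scale factors and first-layer weights follows from the identity $W(s \odot x) + b = (W\,\mathrm{diag}(s))\,x + b$: rescaling feature $j$ by $s_j$ has the same effect on the first layer as multiplying column $j$ of its weight matrix by $s_j$. The short check below illustrates this numerically; it is a sketch with hypothetical function names and random values, not code from the paper.

```python
import random

def first_layer(W, b, x):
    """Pre-activation of a dense layer: z = W x + b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def rescale_inputs(x, s):
    """Element-wise product s * x (the feature rescaling)."""
    return [s_j * x_j for s_j, x_j in zip(s, x)]

def fold_scales_into_weights(W, s):
    """Multiply column j of W by s[j], i.e. compute W diag(s)."""
    return [[w_ij * s_j for w_ij, s_j in zip(row, s)] for row in W]

rng = random.Random(0)
n_in, n_hidden = 4, 3                      # arbitrary sizes for the demo
W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]
s = [rng.uniform(0.1, 10) for _ in range(n_in)]

z_scaled_inputs = first_layer(W, b, rescale_inputs(x, s))
z_scaled_weights = first_layer(fold_scales_into_weights(W, s), b, x)

# The two pre-activations agree up to floating-point rounding.
max_gap = max(abs(a - c) for a, c in zip(z_scaled_inputs, z_scaled_weights))
```

With the same random initializer for $W$, each candidate $s$ therefore starts training from a different effective first-layer weight matrix, which is why the abstract describes the GA exploration as a multi-start search restricted to those weights.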

Paper Structure

This paper contains 12 sections, 7 equations, 1 figure, 8 tables.

INTRODUCTION
PROPOSED METHOD: OPTIMAL FEATURE RESCALING
Optimization problem
Genetic Algorithm
CASE STUDY
Dataset generation
Test 1: OFR application
Test 2: OFR application with ES
Test 3: OFR effects on a simplified FFNN with ES
Genetic Algorithm computational time
Efficiency analysis
CONCLUSIONS

Figures (1)

Figure 1: Centerless grinding process
