Nonsmooth optimisation and neural networks
Vinesha Peiris, Nadezda Sukhorukova
TL;DR
This work addresses training a neural network under a uniform deviation loss by formulating it as a nonsmooth optimisation problem for a single-layer network and analysing it with quasidifferential calculus. It derives inf-stationary conditions and proposes a step-by-step bisection algorithm to build the affine pieces, proving global minimisation for the quasiconvex subproblems when adding pieces. The study covers both smooth and Leaky ReLU activations, and demonstrates the approach on TwoLead-ECG datasets, showing finite convergence to near-optimal solutions. The results provide a principled framework for exact minimisation in this restricted NN setting and point to avenues for extension to deeper architectures and alternative activations, along with efficiency improvements.
Abstract
In this paper, we study neural networks from the point of view of nonsmooth optimisation, namely quasidifferential calculus. We restrict ourselves to uniform approximation by a neural network without hidden layers, with activation functions restricted to continuous, strictly increasing functions. We then develop an algorithm that computes an approximation with one hidden layer through a step-by-step procedure. The nonsmooth analysis techniques prove efficient: in particular, they partially explain why the step-by-step procedure may run without any improvement in the objective function after just one step.
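To make the setting concrete, the uniform approximation problem for a network without hidden layers can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: data points $(x_i, y_i)$, $i = 1, \dots, N$, weights $w$, bias $b$, and a continuous, strictly increasing activation $\sigma$.

```latex
% Assumed notation: x_i \in \mathbb{R}^d, y_i \in \mathbb{R}, i = 1,\dots,N
\min_{w \in \mathbb{R}^d,\; b \in \mathbb{R}} \; \Psi(w,b),
\qquad
\Psi(w,b) = \max_{1 \le i \le N} \bigl| y_i - \sigma(w^\top x_i + b) \bigr| .

% For a level t \ge 0 such that y_i - t and y_i + t lie in the range of \sigma:
\Psi(w,b) \le t
\iff
\sigma^{-1}(y_i - t) \;\le\; w^\top x_i + b \;\le\; \sigma^{-1}(y_i + t),
\quad i = 1,\dots,N .
```

Under these assumptions, each sublevel set of $\Psi$ is an intersection of half-spaces in $(w, b)$ and hence convex. This is consistent with the quasiconvexity that a bisection over the level $t$ would rely on: for each fixed $t$, checking whether the deviation $t$ is attainable reduces to a linear feasibility problem.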
