Negative Feedback System as Optimizer for Machine Learning Systems

Md Munir Hasan, Jeremy Holleman

TL;DR

The paper addresses optimizing neural networks by viewing training as a physical inverse problem realized via a negative feedback loop. It formalizes a forward function with high gain $A$, whose inverse is effectively realized in the backward path, and derives error propagation that matches backpropagation under squared error minimization, while enabling learning of certain non-differentiable activations via windowing. It further shows that many standard optimization techniques (e.g., weight decay, adaptive momentum, residual connections) arise naturally from this framework and discusses implications for biological plausibility and analog hardware. The proposed perspective offers a unifying, physics-inspired view of neural network optimization, providing new intuition for training dynamics and potential solutions to open problems such as dying ReLU and vanishing gradients. The work suggests practical pathways toward low-power optimization hardware and deeper theoretical connections between optimization and physical processes.

Abstract

With high forward gain, a negative feedback system can perform the inverse of a linear or non-linear function placed in its feedback path. This property has been widely used in analog electronic circuits to construct precise closed-loop functions. This paper describes how the function-inverting process of a negative feedback system serves as a physical analogy of the optimization technique in machine learning. We show that this process is able to learn some non-differentiable functions in cases where a gradient descent-based method fails. We also show that the optimization process reduces to gradient descent under the constraint of squared error minimization. We derive the backpropagation technique and other known optimization techniques of deep networks from the properties of a negative feedback system, independently of the gradient descent method. This analysis provides a novel view of neural network optimization and may provide new insights on open problems.
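The function-inverting property described above can be illustrated with a small numerical sketch. It is not the authors' implementation: the function name `feedback_inverse`, the integrator model of the forward path, and all parameter values are assumptions chosen for illustration. The loop drives the error `x - f(y)` toward zero, so the settled output approximates the inverse of `f` evaluated at `x`. With an exponential in the feedback path, the loop settles to a logarithm.

```python
import math


def feedback_inverse(f, x, gain=1000.0, dt=1e-4, steps=2000, y0=0.0):
    """Settle a negative feedback loop whose feedback path applies f.

    Assumption: the high-gain forward path is modelled as an integrator,
    dy/dt = gain * (x - f(y)), stepped with forward Euler. At equilibrium
    the error x - f(y) is zero, so y is the inverse of f at x.
    """
    y = y0
    for _ in range(steps):
        error = x - f(y)        # difference between input and fed-back signal
        y += dt * gain * error  # forward path accumulates the amplified error
    return y


# Exponential element in the feedback path: the loop output approaches log(x).
y_settled = feedback_inverse(math.exp, 5.0)
print(y_settled, math.log(5.0))
```

The step size and gain here are picked so that the discrete loop is stable for moderate inputs; a much larger `x` or `gain * dt` product would make this simple Euler update oscillate.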

Paper Structure

This paper contains 19 sections, 8 equations, 7 figures.

Figures (7)

  • Figure 1: (a) A generic negative feedback system. (b) An operational amplifier with an exponential element in the feedback path realizes a logarithmic input-output function. The transistor Q has an exponential voltage-to-current relationship. The feedback system implements the inverse of the exponential, i.e., the logarithmic function.
  • Figure 2: A negative feedback system as optimizer for machine learning system.
  • Figure 3: Torque analogy of error. The solid and dashed lines represent $y$ and $y'$ respectively. The difference $y-y'$, shown by the dotted arrows, can be thought of as forces acting on the x-axis. The resulting total torque $\sum_{k}(y^{[k]}-y'^{[k]})x^{[k]}$ is the output of the error function given by Eq. (\ref{eq:multi-var-error}).
  • Figure 4: (a) Target data $y=\sigma(wx+b)$ for $\sigma = \operatorname{sgn}(z+1)+\operatorname{sgn}(z)+\operatorname{sgn}(z-1)$, where $z=wx+b$. The plot is shown for $w=1, b=-1$. (b) Average squared error during training for the negative feedback system and SGD versus training iteration. (c) Illustration of the failure of squared error minimization.
  • Figure 5: Backpropagating the difference vector to previous layers.
  • ...and 2 more figures
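
The torque expression in the Figure 3 caption, $\sum_{k}(y^{[k]}-y'^{[k]})x^{[k]}$, can be tried out on a toy problem. The sketch below is illustrative only and makes several assumptions: $y$ is taken to be the target and $y'$ the model output (the caption does not say which is which), the model is a plain line $y' = wx + b$ with no activation, and the names `torque` and `fit_line`, the rate, and the data are hypothetical. For this linear case the torque equals the negative gradient of half the summed squared error with respect to $w$, consistent with the abstract's statement that the process reduces to gradient descent under squared error minimization.

```python
def torque(xs, ys, ys_pred):
    """Total torque: sum over samples of (y - y') * x."""
    return sum((y - yp) * x for x, y, yp in zip(xs, ys, ys_pred))


def fit_line(xs, ys, rate=0.5, steps=500):
    """Adjust w and b of y' = w * x + b until torque and net force vanish.

    Assumption: w is moved in proportion to the average torque and b in
    proportion to the average force (the mean of y - y').
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        ys_pred = [w * x + b for x in xs]
        w += rate * torque(xs, ys, ys_pred) / n
        b += rate * sum(y - yp for y, yp in zip(ys, ys_pred)) / n
    return w, b


# Hypothetical target line with w = 1.5 and b = -0.5, sampled on [-1, 1].
xs = [i / 10 - 1.0 for i in range(21)]
ys = [1.5 * x - 0.5 for x in xs]
w, b = fit_line(xs, ys)
print(w, b)
```

When the fitted line matches the target, every force $y - y'$ is zero, so both the torque and the net force vanish and the parameters stop moving.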