Optimization on black-box function by parameter-shift rule
Vu Tuan Hai
TL;DR
The paper tackles black-box optimization, where the parameter–outcome relationship is opaque and analytic gradients are unavailable. It adapts the parameter-shift rule (PSR), originally from quantum computing, into a zeroth-order gradient estimation method intended to reduce query counts and achieve favorable computational scaling. The authors apply the approach to a perceptron and to simple nonlinear functions, and report gradient estimates that closely match the analytic gradients. They discuss strategies for selecting the PSR parameters $(r,\epsilon)$, including grid search and a possible reduction to a one-dimensional search when $r=h(\epsilon)$, and outline future work toward broader practical deployment.
Abstract
Machine learning is now applied in many domains, but training a machine learning model is becoming increasingly difficult. A growing number of optimization problems are "black-box": the relationship between model parameters and outcomes is uncertain or too complex to trace. Optimizing such models is hard when they require a large number of query observations and parameters. To overcome the drawbacks of existing algorithms, we propose a zeroth-order method that originates in quantum computing, the parameter-shift rule, which uses fewer parameters than previous methods.
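To make the idea of a zeroth-order parameter-shift estimate concrete, here is a minimal sketch in Python. It assumes the common two-term form $\partial f/\partial\theta_i \approx r\,[f(\theta + \epsilon e_i) - f(\theta - \epsilon e_i)]$, which may differ from the exact rule used in the paper. The names `psr_gradient` and `black_box` are hypothetical and chosen only for illustration.

```python
import math


def psr_gradient(f, theta, r=0.5, eps=math.pi / 2):
    """Estimate the gradient of a black-box f at theta using only function queries.

    Assumed two-term shift rule (not necessarily the paper's exact form):
        df/dtheta_i ~= r * (f(theta + eps * e_i) - f(theta - eps * e_i))
    This costs two queries of f per parameter.
    """
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps   # shift the i-th parameter forward
        minus[i] -= eps  # shift the i-th parameter backward
        grad.append(r * (f(plus) - f(minus)))
    return grad


def black_box(theta):
    # Hypothetical stand-in for an opaque model: sinusoidal in each parameter.
    return math.sin(theta[0]) + 0.5 * math.cos(theta[1])


theta = [0.3, 1.2]
estimate = psr_gradient(black_box, theta)  # r = 1/2, eps = pi/2
analytic = [math.cos(theta[0]), -0.5 * math.sin(theta[1])]

# For a non-sinusoidal function, choosing r = 1 / (2 * eps) with a small eps
# turns the same formula into an ordinary central difference.
small_eps = 1e-4
cubic = psr_gradient(lambda t: t[0] ** 3, [2.0], r=1 / (2 * small_eps), eps=small_eps)
```

For the sinusoidal stand-in, the choice $r = 1/2$, $\epsilon = \pi/2$ reproduces the analytic gradient up to floating-point error. For the cubic, the estimate is only approximate, which illustrates why the choice of $(r,\epsilon)$ matters for general functions.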
