Comparisons Are All You Need for Optimizing Smooth Functions

Chenyi Zhang, Tongyang Li

TL;DR

The paper shows that a pairwise comparison oracle alone suffices to find near-optimal points of smooth functions in both convex and nonconvex settings. It introduces Comparison-GDE to estimate gradient directions from comparisons and uses this estimator in two convex-optimization pipelines (adaptive NGD and cutting planes) to match or surpass function-evaluation-based rates in the $n$-dependence. For nonconvex functions, it develops Comparison-NGD for first-order stationary points and a saddle-point escape scheme (Comparison-PNGD with NCF subroutines) that reaches $\epsilon$-second-order stationary points using $\tilde{O}(n^{1.5}/\epsilon^{2.5})$ comparison queries. Overall, the results demonstrate that directional information from comparisons is enough to drive derivative-free optimization with competitive query complexities, with potential implications for preference-based RL and quantization in neural networks.
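To make the idea of descending with comparison-only information concrete, here is a minimal Python sketch. It is not the paper's Comparison-GDE or Comparison-NGD: the direction estimate is a simple coordinate-wise sign guess, and the helper names (`make_comparison_oracle`, `estimate_direction`, `comparison_descent`), the probe length `delta`, and the geometric step decay are all assumptions chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Oracle = Callable[[Vector, Vector], bool]


def make_comparison_oracle(f: Callable[[Vector], float]) -> Tuple[Oracle, Dict[str, int]]:
    """Hide f behind an oracle that only reports whether f(x) <= f(y)."""
    stats = {"queries": 0}

    def oracle(x: Vector, y: Vector) -> bool:
        stats["queries"] += 1
        return f(x) <= f(y)

    return oracle, stats


def estimate_direction(oracle: Oracle, x: Vector, delta: float) -> Vector:
    """Guess the sign of each partial derivative with one comparison per coordinate.

    Hypothetical stand-in for a gradient-direction estimator, not Comparison-GDE.
    """
    signs = []
    for i in range(len(x)):
        probe = list(x)
        probe[i] += delta
        # f(x + delta*e_i) <= f(x) suggests the i-th partial derivative is non-positive.
        signs.append(-1.0 if oracle(probe, x) else 1.0)
    scale = math.sqrt(len(x))
    return [s / scale for s in signs]  # unit-norm direction estimate


def comparison_descent(
    oracle: Oracle,
    x0: Vector,
    iterations: int = 300,
    step0: float = 1.0,
    decay: float = 0.95,
    delta: float = 1e-6,
) -> Vector:
    """Normalized descent: move a fixed length along the estimated direction.

    The geometric step decay is an assumed schedule, not taken from the paper.
    """
    x = list(x0)
    step = step0
    for _ in range(iterations):
        d = estimate_direction(oracle, x, delta)
        x = [xi - step * di for xi, di in zip(x, d)]
        step *= decay
    return x


# Usage: minimize a separable quadratic without ever reading a function value.
target = [1.0, -2.0, 3.0]
oracle, stats = make_comparison_oracle(
    lambda x: sum((xi - ti) ** 2 for xi, ti in zip(x, target))
)
x_final = comparison_descent(oracle, [0.0, 0.0, 0.0])
```

Each iteration of this sketch spends one comparison per coordinate, so the run above issues $3 \times 300 = 900$ queries. Because only the direction is estimated, the step length has to come from a schedule rather than from the gradient norm, which is why a normalized update is the natural fit.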

Abstract

When optimizing machine learning models, there are various scenarios where gradient computations are challenging or even infeasible. Furthermore, in reinforcement learning (RL), preference-based RL that only compares between options has wide applications, including reinforcement learning with human feedback in large language models. In this paper, we systematically study optimization of a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$ only assuming an oracle that compares function values at two points and tells which is larger. When $f$ is convex, we give two algorithms using $\tilde{O}(n/\epsilon)$ and $\tilde{O}(n^{2})$ comparison queries, respectively, to find an $\epsilon$-optimal solution. When $f$ is nonconvex, our algorithm uses $\tilde{O}(n/\epsilon^2)$ comparison queries to find an $\epsilon$-approximate stationary point. All these results match the best-known zeroth-order algorithms with function evaluation queries in $n$ dependence, thus suggesting that comparisons are all you need for optimizing smooth functions using derivative-free methods. In addition, we also give an algorithm for escaping saddle points and reaching an $\epsilon$-second-order stationary point of a nonconvex $f$, using $\tilde{O}(n^{1.5}/\epsilon^{2.5})$ comparison queries.
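As a quick reading aid for the query bounds quoted in the abstract, the following worked comparison ignores the polylogarithmic factors hidden by $\tilde{O}$, so it indicates only the rough regime in which each bound is smaller. It is a back-of-the-envelope calculation, not a statement from the paper.

```latex
% Polylogarithmic factors hidden by \tilde{O} are ignored throughout.
\[
\frac{n}{\epsilon} \le n^{2} \iff \epsilon \ge \frac{1}{n},
\qquad
\frac{n^{1.5}/\epsilon^{2.5}}{n/\epsilon^{2}} = \sqrt{\frac{n}{\epsilon}} .
\]
```

For example, with $n = 100$ and $\epsilon = 10^{-3}$, the first convex bound evaluates to $10^{5}$ while the second is $10^{4}$, so the $\tilde{O}(n^{2})$ algorithm would be preferable at that precision. The second identity shows that, up to the same hidden factors, reaching a second-order stationary point costs a factor of about $\sqrt{n/\epsilon}$ more than reaching a first-order one.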

Paper Structure

This paper contains 23 sections, 24 theorems, 205 equations, and 1 figure.

Key Result

Lemma 1

Given a point $\mathbf{x}\in\mathbb{R}^{n}$, a unit vector $\mathbf{v}\in\mathbb{B}_1(0)$, and precision $\Delta>0$ for directional preference, \ref{algo:DP} is correct:
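The lemma's conclusion is not reproduced in this listing, so the following Python sketch is not the paper's algo:DP. It only illustrates one standard way a single comparison can certify a directional preference, under the assumption that $f$ is $L$-smooth: probing at distance $t = 2\Delta/L$ along $\mathbf{v}$ and applying the bound $|f(\mathbf{x}+t\mathbf{v}) - f(\mathbf{x}) - t\langle\nabla f(\mathbf{x}),\mathbf{v}\rangle| \le Lt^2/2$. The function name, the probe length, and the `oracle(a, b)` convention (true when $f(a) \le f(b)$) are all hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def directional_preference(
    oracle: Callable[[Vector, Vector], bool],
    x: Vector,
    v: Vector,
    delta: float,
    smoothness: float,
) -> int:
    """Use one comparison to bound <grad f(x), v> for an L-smooth f.

    Returns -1 to certify <grad f(x), v> <= delta,
    and +1 to certify <grad f(x), v> >= -delta.
    Assumes oracle(a, b) is True exactly when f(a) <= f(b).
    """
    t = 2.0 * delta / smoothness  # assumed probe length, so that L*t/2 == delta
    probe = [xi + t * vi for xi, vi in zip(x, v)]
    if oracle(probe, x):
        # f(x + t v) <= f(x) gives t*<g, v> <= L*t^2/2, i.e. <g, v> <= delta.
        return -1
    # f(x + t v) > f(x) gives t*<g, v> > -L*t^2/2, i.e. <g, v> > -delta.
    return 1


# Usage on f(p) = p_0^2 + p_1^2, which is 2-smooth with gradient 2p.
f = lambda p: p[0] ** 2 + p[1] ** 2
oracle = lambda a, b: f(a) <= f(b)
answer = directional_preference(oracle, [1.0, 0.0], [-1.0, 0.0], 0.1, 2.0)
```

The two outcomes overlap when $|\langle\nabla f(\mathbf{x}),\mathbf{v}\rangle| \le \Delta$, so in that band either answer is valid; the precision $\Delta$ controls how wide the ambiguous band is.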

Figures (1)

  • Figure 1: The intuition behind Algorithm \ref{algo:Comparison-Triangle} for computing Hessian-vector products using gradient directions.

Theorems & Definitions (46)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • Lemma 2
  • proof: Proof of \ref{thm:CCO}
  • Lemma 3 (Theorem 1.1, \cite{jiang2020improved})
  • Theorem 3
  • proof: Proof of \ref{thm:cutting-plane}
  • ...and 36 more