Multiple Scale Methods For Optimization Of Discretized Continuous Functions

Nicholas J. E. Richardson, Noah Marusenko, Michael P. Friedlander

TL;DR

This work addresses optimization over spaces of Lipschitz functions by introducing a multiscale framework that solves discretized problems at progressively finer grids and uses interpolation to warm-start subsequent scales. Two variants, greedy and lazy, provide convergence guarantees and extend to any base algorithm with fixed-rate iterate convergence, while incorporating constraint scaling to preserve feasibility across scales. Theoretical results bound interpolation errors and relate discrete solutions to the continuous problem, showing that multiscale optimization can outperform single-scale projected gradient descent in terms of both convergence speed and computational cost. Empirical results on density demixing tasks—including synthetic and real geological data—demonstrate substantial speedups and memory savings, highlighting practical impact for large-scale, discretized continuous optimization problems.

Abstract

A multiscale optimization framework for problems over a space of Lipschitz continuous functions is developed. The method solves a coarse-grid discretization, then uses linear interpolation to warm-start projected gradient descent on progressively finer grids. Greedy and lazy variants are analyzed, and convergence guarantees are derived that show the multiscale approach achieves provably tighter error bounds at lower computational cost than single-scale optimization. The analysis extends to any base algorithm with iterate convergence at a fixed rate. Constraint modification techniques preserve feasibility across scales. Numerical experiments on probability density estimation problems, including geological data, demonstrate speedups of an order of magnitude or better.
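
The coarse-to-fine procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the objective (a nonnegativity-constrained smoothing problem), the names `pgd` and `multiscale_pgd`, the weight `lam`, and the fixed iteration budget `coarse_iters` at coarse scales are all hypothetical choices made for the example.

```python
import numpy as np

def pgd(x0, y, lam, tol=1e-8, max_iter=100_000):
    """Projected gradient descent (hypothetical test problem) for
        min_{x >= 0} 0.5*||x - y||^2 + 0.5*(lam/h^2)*||diff(x)||^2,
    where h is the grid spacing on [0, 1]. Returns (x, iterations used)."""
    h = 1.0 / (len(y) - 1)
    w = lam / h**2
    step = 1.0 / (1.0 + 4.0 * w)  # 1 / (upper bound on the smoothness constant)
    x = x0.copy()
    for k in range(1, max_iter + 1):
        d = np.diff(x)
        g = x - y                 # gradient of the data-fit term
        g[:-1] -= w * d           # gradient of the smoothing term
        g[1:] += w * d
        x_new = np.maximum(x - step * g, 0.0)  # project onto x >= 0
        if np.linalg.norm(x_new - x) <= tol:
            return x_new, k
        x = x_new
    return x, max_iter

def multiscale_pgd(signal, lam, n_scales=4, coarse_iters=50):
    """Solve on a coarse grid, interpolate, and warm-start the next finer grid.
    Assumes len(signal) == 2**m + 1 so that each coarse grid nests in the finer one."""
    stride = 2 ** (n_scales - 1)
    t_fine = np.linspace(0.0, 1.0, len(signal))
    t, y = t_fine[::stride], signal[::stride]
    x = np.maximum(y, 0.0)        # feasible starting point at the coarsest scale
    while stride > 1:
        x, _ = pgd(x, y, lam, max_iter=coarse_iters)  # fixed budget at coarse scales
        stride //= 2
        t_next, y = t_fine[::stride], signal[::stride]
        x = np.interp(t_next, t, x)  # linear interpolation keeps x >= 0
        t = t_next
    return pgd(x, y, lam)         # run to tolerance at the finest scale

rng = np.random.default_rng(0)
t_grid = np.linspace(0.0, 1.0, 129)
signal = np.maximum(np.sin(2 * np.pi * t_grid), 0.0) + 0.1 * rng.standard_normal(129)

x_multi, k_multi = multiscale_pgd(signal, lam=1e-3)
x_single, k_single = pgd(np.maximum(signal, 0.0), signal, lam=1e-3)
print("finest-scale iterations (multiscale):", k_multi)
print("iterations (single scale):", k_single)
```

In this toy problem the constraint is a simple nonnegativity bound, which linear interpolation preserves automatically; more general constraints would need the kind of constraint modification the abstract mentions.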

Paper Structure

This paper contains 22 sections, 23 theorems, 77 equations, 11 figures, and 3 algorithms.

Key Result

Lemma 6

Projected gradient descent with stepsize $\alpha$ has iterate convergence with a rate that depends on $\alpha$. This rate is minimized at a stepsize of $\alpha = 2/(\mathcal{S}_{\tilde{\mathcal{L}}}+\mu_{\tilde{\mathcal{L}}})$, where it depends only on the condition number $c=\mathcal{S}_{\tilde{\mathcal{L}}}/\mu_{\tilde{\mathcal{L}}}$.
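
As a rough numerical illustration, the sketch below assumes the standard textbook form of this result for a $\mu$-strongly convex, $\mathcal{S}$-smooth objective: a per-iteration contraction factor of $\max\{|1-\alpha\mu|, |1-\alpha\mathcal{S}|\}$, which equals $(c-1)/(c+1)$ at $\alpha = 2/(\mathcal{S}+\mu)$. The paper's own statement may differ in notation. The box-constrained quadratic and all variable names are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, S = 20, 1.0, 25.0

# Quadratic objective 0.5*x^T A x - b^T x with eigenvalues of A spread over [mu, S].
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(mu, S, n)) @ Q.T
b = rng.standard_normal(n)

def pgd_step(x, alpha):
    """One projected gradient step onto the box [0, 1]^n."""
    return np.clip(x - alpha * (A @ x - b), 0.0, 1.0)

alpha = 2.0 / (S + mu)   # stepsize from the lemma
c = S / mu               # condition number
q = (c - 1.0) / (c + 1.0)  # assumed textbook contraction factor

# Reference solution: iterate long enough to reach machine precision.
x_star = np.zeros(n)
for _ in range(5000):
    x_star = pgd_step(x_star, alpha)

# Track the observed per-iteration contraction of the distance to x_star.
x = rng.uniform(0.0, 1.0, n)
ratios = []
for _ in range(30):
    x_next = pgd_step(x, alpha)
    err = np.linalg.norm(x - x_star)
    if err > 1e-8:
        ratios.append(np.linalg.norm(x_next - x_star) / err)
    x = x_next

print("assumed bound q =", q)
print("largest observed ratio =", max(ratios))
```

The observed ratios can be well below the bound, since the projection and the particular starting point often give faster contraction than the worst case.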

Figures (11)

  • Figure 1: Comparison of the multiscale and single-scale approaches for the motivating example from \ref{sec-motivating-example}. Dots represent the median total time in milliseconds over $100$ trials; shaded regions represent the $5$th and $95$th percentile times. The multiscale approach improves algorithm speed as problem size grows. Details are provided in \ref{sec-motivating-example-numerics}.
  • Figure 1: Example for \ref{lem-lipschitz-interpolation} with the function $x=f(t) = t - \cos(3\pi t)$ on the interval $[\ell, u]=[0,1]$. The function $f$ must lie within the dashed parallelogram since it is $L_f$-Lipschitz with $L_{f}=1+3\pi$. \ref{lem-lipschitz-interpolation} uses the parallelogram constraint to bound the distance between $f$ and the linear interpolation $\lambda_1 f(a) + \lambda_2 f(b)$. \ref{lem-lipschitz-interpolation-tightness} shows that the bound in \ref{eq-lipschitz-interpolation-bound} is tight for any given $\lambda_1,\lambda_2\ge0$ with $\lambda_1+\lambda_2=1$.
  • Figure 1: (Left) Median time until convergence at the finest scale $s=1$ as a function of the number of fixed iterations $K_s$ at coarser scales $s=S,S-1,\dots,2$. (Right) Median number of iterations $K_1$ at the finest scale $s=1$. Shaded ribbons show the bottom and top quartile times (left) and iterations (right).
  • Figure 2: Typical progression of iterates $x_s^k$ using the single-scale (top row) and multiscale (bottom row) approaches for the motivating example from \ref{sec-motivating-example}. The clean ground truth is plotted behind as a thick dashed line.
  • Figure 2: Illustration of \ref{lem-exact-interpolation} (top) and \ref{lem-inexact-interpolation} (bottom) for the function $f(t)=72t^3-80t^2+10t+1$. Both lemmas bound the error between the true fine-scale discretization $x^*_s$ (orange circles) and the linear interpolation $x_s=\underline{x}_{s+1}$ (green squares) obtained from a coarser-scale discretization. The top figure shows \ref{eq-lem-exact-interpolation}, where $x_{s+1}=x^*_{s+1}$ is exact. The bottom figure shows \ref{eq-lem-inexact-interpolation}, where $x_{s+1}=x^*_{s+1}+\delta$ contains error.
  • ...and 6 more figures
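
The Lipschitz interpolation example in the caption for $f(t) = t - \cos(3\pi t)$ can be checked numerically. The exact inequality in the referenced lemma is not reproduced in this summary, so the sketch below assumes the bound that follows from applying the Lipschitz property at each endpoint, $|f(\lambda_1 a + \lambda_2 b) - (\lambda_1 f(a) + \lambda_2 f(b))| \le 2 L_f \lambda_1 \lambda_2 |b - a|$. The lemma's statement may be written differently, and the variable names are illustrative only.

```python
import numpy as np

def f(t):
    """Function from the figure caption; Lipschitz with constant 1 + 3*pi."""
    return t - np.cos(3 * np.pi * t)

L_f = 1 + 3 * np.pi
a, b = 0.0, 1.0

# Convex weights lam1 + lam2 = 1 sweeping the whole interval [a, b].
lam1 = np.linspace(0.0, 1.0, 1001)
lam2 = 1.0 - lam1
t = lam1 * a + lam2 * b

# Distance between f and the linear interpolation of its endpoint values.
gap = np.abs(f(t) - (lam1 * f(a) + lam2 * f(b)))

# Assumed bound: the triangle inequality plus the Lipschitz property at each endpoint.
bound = 2 * L_f * lam1 * lam2 * abs(b - a)

print("max interpolation gap:", gap.max())
print("max of assumed bound:", bound.max())
print("bound holds everywhere:", bool(np.all(gap <= bound + 1e-12)))
```

The bound vanishes at both endpoints and peaks at the midpoint, matching the parallelogram picture described in the caption.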

Theorems & Definitions (49)

  • Definition 1: Dyadic Coarsening
  • Definition 2: Midpoint Linear Interpolation
  • Definition 3: Vector of Free Variables
  • Definition 4: Piecewise Linear Approximation
  • Remark 1
  • Definition 5: Q-Linear Iterate Convergence
  • Lemma 6: Descent Lemma [nesterov-smooth-2018]
  • Proposition 1: Single Linear Constraint Scaling
  • Proof 1
  • Corollary 2: Matrix Linear Constraint Scaling
  • ...and 39 more