Reducing Contextual Stochastic Bilevel Optimization via Structured Function Approximation

Maxime Bouscary, Jiawei Zhang, Saurabh Amin

TL;DR

This work addresses Contextual Stochastic Bilevel Optimization (CSBO), where the lower-level problem depends on context and is typically intractable because it requires conditional sampling and solving a separate inner problem for each context. It proposes a reduction that parameterizes the context-dependent lower-level solution $y^\star(x,\xi)$ with an expressive basis $\Phi$, yielding a standard SBO problem that can be solved using only samples from the joint distribution $\mathbb{P}_{(\xi,\eta)}$. Theoretical results show that, when the basis is sufficiently expressive and well-conditioned, the hypergradient of the reduced problem closely approximates the true hypergradient, and an $\epsilon$-stationary CSBO solution can be obtained with $\tilde{O}(\epsilon^{-3})$ complexity. Chebyshev polynomials are shown to satisfy the required conditions, enabling near-optimal rates for a broad class of problems. Empirical tests on inverse optimization and hyperparameter optimization show faster convergence, better sample efficiency, and lower memory usage than partition-based CSBO baselines.
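To give a feel for why conditioning of the basis matters, the following minimal sketch compares the empirical Gram matrices of a monomial basis and a Chebyshev basis. It is an illustration under stated assumptions, not the paper's experiment: contexts are drawn uniformly from $[-1, 1]$, the degree and sample size are arbitrary, and the helper name `gram_condition_numbers` is hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def gram_condition_numbers(degree=12, n_samples=5000, seed=0):
    """Condition numbers of monomial and Chebyshev feature Gram matrices.

    Assumption (illustrative only): scalar contexts xi ~ Uniform[-1, 1].
    """
    rng = np.random.default_rng(seed)
    xi = rng.uniform(-1.0, 1.0, size=n_samples)

    # One row of features per context; both have shape (n_samples, degree + 1).
    monomial = np.vander(xi, degree + 1, increasing=True)  # 1, xi, xi^2, ...
    cheb = C.chebvander(xi, degree)                         # T_0(xi), ..., T_degree(xi)

    # Empirical estimates of E[Phi(xi) Phi(xi)^T].
    gram_mono = monomial.T @ monomial / n_samples
    gram_cheb = cheb.T @ cheb / n_samples
    return np.linalg.cond(gram_mono), np.linalg.cond(gram_cheb)


cond_mono, cond_cheb = gram_condition_numbers()
print(f"monomial Gram condition number:  {cond_mono:.3e}")
print(f"Chebyshev Gram condition number: {cond_cheb:.3e}")
```

In this toy setting the Chebyshev Gram matrix is far better conditioned than the monomial one, which is the kind of property the "well-conditioned" requirement refers to. The exact metrics used in the paper may differ from the plain condition number shown here.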

Abstract

Contextual Stochastic Bilevel Optimization (CSBO) extends standard stochastic bilevel optimization (SBO) by incorporating context-dependent lower-level problems. CSBO problems are generally intractable since existing methods require solving a distinct lower-level problem for each sampled context, resulting in prohibitive sample and computational complexity, in addition to relying on impractical conditional sampling oracles. We propose a reduction framework that approximates the lower-level solutions using expressive basis functions, thereby decoupling the lower-level dependence on context and transforming CSBO into a standard SBO problem solvable using only joint samples from the context and noise distribution. First, we show that this reduction preserves hypergradient accuracy and yields an $\epsilon$-stationary solution to CSBO. Then, we relate the sample complexity of the reduced problem to simple metrics of the basis. This establishes sufficient criteria for a basis to yield $\epsilon$-stationary solutions with a near-optimal complexity of $\widetilde{O}(\epsilon^{-3})$, matching the best-known rate for standard SBO up to logarithmic factors. Moreover, we show that Chebyshev polynomials provide a concrete and efficient choice of basis that satisfies these criteria for a broad class of problems. Empirical results on inverse and hyperparameter optimization demonstrate that our approach outperforms CSBO baselines in convergence, sample efficiency, and memory usage.
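The reduction idea can be sketched on a toy problem. The example below is a minimal illustration under assumptions that are not from the paper: a scalar context $\xi \sim \mathrm{Uniform}[-1,1]$, Gaussian noise $\eta$, and a quadratic lower-level objective $g(x, y; \xi, \eta) = \tfrac{1}{2}\,(y - \sin(3x\xi) - \eta)^2$, whose context-dependent solution is $y^\star(x,\xi) = \sin(3x\xi)$. Replacing $y$ with $\Phi(\xi)^\top W$ for a Chebyshev basis $\Phi$ turns the family of per-context problems into a single problem in $W$ that needs only joint samples $(\xi, \eta)$. Because the toy objective is quadratic, the sketch solves it by least squares rather than with an SBO solver. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def fit_reduced_lower_level(x, degree=10, n_samples=4000, noise_std=0.1, seed=0):
    """Fit W so that Phi(xi)^T W approximates y*(x, xi) from joint samples.

    Toy assumption: g(x, y; xi, eta) = 0.5 * (y - sin(3 x xi) - eta)^2,
    so minimizing E[g] over W is an ordinary least-squares problem.
    """
    rng = np.random.default_rng(seed)
    xi = rng.uniform(-1.0, 1.0, size=n_samples)       # contexts
    eta = rng.normal(0.0, noise_std, size=n_samples)  # noise, drawn jointly with xi
    targets = np.sin(3.0 * x * xi) + eta

    Phi = C.chebvander(xi, degree)  # rows are (T_0(xi), ..., T_degree(xi))
    W, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return W


def predict(W, xi):
    """Evaluate the parameterized lower-level solution Phi(xi)^T W."""
    return C.chebval(xi, W)


x = 0.8  # a fixed upper-level variable, chosen arbitrarily
W = fit_reduced_lower_level(x)
grid = np.linspace(-1.0, 1.0, 201)
max_err = np.max(np.abs(predict(W, grid) - np.sin(3.0 * x * grid)))
print(f"max |Phi(xi)^T W - y*(x, xi)| on grid: {max_err:.4f}")
```

No conditional sampling oracle is used: each sample contributes one $(\xi, \eta)$ pair, and a single coefficient vector $W$ covers all contexts.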

Paper Structure

This paper contains 21 sections, 20 theorems, 101 equations, 2 figures.

Key Result

Proposition 4.1

Under Assumption (assumption:SBO), the following holds for any $x \in \mathbb{R}^{d_x}$: [equation omitted in source], where $K$ is defined in Definition (def:expressive_function).

Figures (2)

  • Figure 1: Loss $F(\bar{x})$ (left), lower level solution error $\Delta_y$ (center), and upper level solution error $\Delta_x$ (right) of stocBiO and our reduction framework using monomial, Fourier, and Chebyshev bases. The reference loss is $F_\text{ref} = F(x^\star)$.
  • Figure 2: Moving average of the validation (left) and training (right) losses over epochs of stocBiO and monomial, Fourier, and Chebyshev bases on hyperparameter optimization.

Theorems & Definitions (43)

  • Remark 3.2
  • Definition 3.3
  • Definition 3.4
  • Proposition 4.1
  • Proposition 4.2
  • Theorem 4.3
  • proof
  • Theorem 4.4
  • Theorem 4.5
  • Corollary 4.6
  • ...and 33 more