A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization

Andrzej Ruszczyński, Shangzhe Yang

TL;DR

The paper addresses conditional stochastic optimization (CSO), in which the objective depends on an inner conditional expectation and the problem is nonconvex and nonsmooth. It introduces a single time-scale stochastic method that jointly updates the decision parameter $\beta$ and an inner-model parameter $\theta$, so that the parametric model $\Psi(X,\theta)$ tracks $F(X,\beta)=\mathbb{E}[f(X,Y,\beta)\mid X]$. Tracking quality is measured by the loss $Q(\beta,\theta)=\tfrac{1}{2}\mathbb{E}[\|F(X,\beta)-\Psi(X,\theta)\|^2]$, which is assumed to satisfy a stochastic Łojasiewicz condition. The update directions are $\tilde d_\beta^k=-f_\beta^k \nabla g(\Psi^k)$ and $\tilde d_\theta^k=\gamma\,\Psi_\theta^k (f^k-\Psi^k)$, followed by projections onto compact sets. Convergence is proved with a Lyapunov function $w(\beta,\theta)=G(\beta)+\alpha\Delta^\lambda(\beta,\theta)$ and the method of differential inclusions, yielding almost-sure convergence to a stationary set $\mathbb{Z}^*$. A calibrated numerical example in a linear MDP illustrates tracking accuracy and convergence of the outer objective. Because unbiased direction estimates require only one observation from the joint distribution per iteration, the method is applicable to real-time learning for CSO in nonconvex nonsmooth settings, with potential impact in reinforcement learning and contextual optimization.
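
To make the two update directions concrete, here is a minimal Python sketch of a single time-scale loop on a toy scalar problem. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the data model ($X$ uniform on $[-1,1]$, $Y = X + \text{noise}$), the inner function $f(x,y,\beta)=\beta y$, the outer function $g(x,u)=\tfrac12(u-2x)^2$, the linear model $\Psi(x,\theta)=\theta x$, the intervals used for projection, the step-size rule, and the names `project` and `run_toy_cso`. With these choices $F(X,\beta)=\beta X$, so the outer minimizer is $\beta=2$ and perfect tracking means $\theta=\beta$.

```python
import random


def project(value, lower, upper):
    """Euclidean projection of a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def run_toy_cso(num_iters=20000, gamma=5.0, seed=0):
    """Toy single time-scale loop; returns the final (beta, theta).

    All modelling choices below are illustrative assumptions, not the
    paper's experiment.
    """
    rng = random.Random(seed)
    beta, theta = -1.0, 0.0
    for k in range(num_iters):
        # One observation (x, y) from the joint distribution per iteration.
        x = rng.uniform(-1.0, 1.0)
        y = x + rng.gauss(0.0, 0.1)

        f_val = beta * y        # f(x, y, beta)
        f_beta = y              # derivative of f with respect to beta
        psi = theta * x         # model value Psi(x, theta)
        psi_theta = x           # derivative of Psi with respect to theta
        grad_g = psi - 2.0 * x  # derivative of g(x, u) in u, evaluated at u = psi

        # Directions in the form given in the TL;DR.
        d_beta = -f_beta * grad_g
        d_theta = gamma * psi_theta * (f_val - psi)

        # The same diminishing step size drives both updates (single time scale).
        tau = 0.5 / (k + 10) ** 0.6
        beta = project(beta + tau * d_beta, -5.0, 5.0)
        theta = project(theta + tau * d_theta, -5.0, 5.0)
    return beta, theta


if __name__ == "__main__":
    beta, theta = run_toy_cso()
    print(f"beta = {beta:.3f}, theta = {theta:.3f}")
```

Note that the $\beta$-direction evaluates $\nabla g$ at the model value $\Psi^k$ rather than at the unknown conditional expectation, which is why the pair of directions is not the gradient of a single objective. The paper's actual step-size conditions and projection sets may differ from the ones assumed here.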

Abstract

We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling, reinforcement learning, and contextual optimization. We propose a specialized single time-scale stochastic method for nonconvex constrained conditional stochastic optimization problems with a Lipschitz smooth outer function and a generalized differentiable inner function. In the method, we approximate the inner conditional expectation with a rich parametric model whose mean squared error satisfies a stochastic version of a Łojasiewicz condition. The model is used by an inner learning algorithm. The main feature of our approach is that unbiased stochastic estimates of the directions used by the method can be generated with one observation from the joint distribution per iteration, which makes it applicable to real-time learning. The directions, however, are not gradients or subgradients of any overall objective function. We prove the convergence of the method with probability one, using the method of differential inclusions and a specially designed Lyapunov function, involving a stochastic generalization of the Bregman distance. Finally, a numerical illustration demonstrates the viability of our approach.
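
The claim that one joint observation yields an unbiased direction can be checked in a line for the inner-model update. The derivation below is a sketch in the notation of the TL;DR, under two assumptions that the summary does not spell out: $\Psi_\theta$ is read as the transposed Jacobian of $\Psi(X,\cdot)$, and differentiation may be exchanged with expectation. Conditioning on $X$ and using $F(X,\beta)=\mathbb{E}[f(X,Y,\beta)\mid X]$ gives

$$
\mathbb{E}\big[\gamma\,\Psi_\theta(X,\theta)\,\big(f(X,Y,\beta)-\Psi(X,\theta)\big)\big]
=\gamma\,\mathbb{E}\big[\Psi_\theta(X,\theta)\,\big(F(X,\beta)-\Psi(X,\theta)\big)\big]
=-\gamma\,\nabla_\theta Q(\beta,\theta),
$$

where $Q(\beta,\theta)=\tfrac{1}{2}\mathbb{E}[\|F(X,\beta)-\Psi(X,\theta)\|^2]$. The first equality is the tower property, since $\Psi_\theta(X,\theta)$ and $\Psi(X,\theta)$ depend only on $X$. So the $\theta$-direction is, in expectation, a scaled descent direction for the tracking loss, even though $F$ itself is never computed.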

Paper Structure

This paper contains 7 sections, 16 theorems, 99 equations, 1 figure.

Key Result

Lemma 2.1

Suppose the function $f(X,Y,\,\cdot\,)$ is differentiable in the generalized sense and its subgradients are bounded by an integrable function in any bounded neighborhood of any point $\beta$. Then for almost all $X$ the function $\beta \mapsto \mathbb{E}[ f(X,Y,\beta) \mid X]$ is differentiable in the ...

Figures (1)

  • Figure 1: The progress of the method on the training set (the top three subgraphs) and on the test set (the bottom five subgraphs).

Theorems & Definitions (37)

  • Example 1.1: Reinforcement Learning
  • Example 1.2: Uplift Modeling
  • Example 1.3: Contextual Optimization
  • Lemma 2.1
  • proof
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Remark 2.5
  • Lemma 4.1
  • ...and 27 more