A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization

Andrzej Ruszczyński; Shangzhe Yang

TL;DR

The paper tackles conditional stochastic optimization (CSO) where the objective depends on an inner conditional expectation, posing nonconvex and nonsmooth challenges. It introduces a single time-scale stochastic method that jointly updates the decision parameter $\beta$ and an inner-model parameter $\theta$ to track $F(X,\beta)=\mathbb{E}[f(X,Y,\beta)\mid X]$ via $\Psi(X,\theta)$, with a tracking loss $Q(\beta,\theta)=\tfrac{1}{2}\mathbb{E}[\|F(X,\beta)-\Psi(X,\theta)\|^2]$ and a stochastic Łojasiewicz condition. The update directions are $\tilde d_\beta^k=-f_\beta^k \nabla g(\Psi^k)$ and $\tilde d_\theta^k=\gamma\;\Psi_\theta^k (f^k-\Psi^k)$, followed by projections onto compact sets; convergence is proved using a Lyapunov function $w(\beta,\theta)=G(\beta)+\alpha\Delta^\lambda(\beta,\theta)$ and a differential inclusion framework, yielding almost-sure convergence to a stationary set $\mathbb{Z}^*$. A calibrated numerical example in a linear MDP demonstrates tracking accuracy and outer-objective convergence. Overall, the approach enables real-time, unbiased-direction learning for CSO in nonconvex nonsmooth settings, with potential impact in reinforcement learning and contextual optimization.
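The two update directions quoted above can be illustrated on a toy scalar problem. The sketch below is not the authors' implementation. The data model ($X\sim N(1,1)$, $Y=X+\text{noise}$), the choices $f(X,Y,\beta)=\beta Y$, $g(u)=\tfrac12(u-1)^2$, $\Psi(X,\theta)=\theta X$, the step-size schedule, the interval $[-5,5]$, and the helper names `project` and `run_sketch` are all assumptions made for illustration. The sketch also uses plain projected steps with one shared step size, which may differ from the exact iteration in the paper. In this toy setting $F(X,\beta)=\beta X$, so the model can track $F$ exactly when $\theta=\beta$, and the outer objective $\mathbb{E}[g(\beta X)]$ is minimized at $\beta=0.5$.

```python
import numpy as np


def project(v, lo=-5.0, hi=5.0):
    """Projection onto a compact interval (hypothetical feasible set)."""
    return float(np.clip(v, lo, hi))


def run_sketch(num_iters=20000, gamma=1.0, seed=0):
    """Toy single time-scale iteration with one (X, Y) sample per step."""
    rng = np.random.default_rng(seed)
    beta, theta = 3.0, -2.0  # arbitrary starting points
    for k in range(num_iters):
        tau = 1.0 / (k + 50) ** 0.8  # assumed diminishing step size

        # One observation from the joint distribution of (X, Y).
        x = rng.normal(1.0, 1.0)
        y = x + rng.normal(0.0, 1.0)

        # Toy inner function f(X, Y, beta) = beta * Y and its beta-derivative.
        f_val = beta * y
        f_beta = y

        # Toy parametric model Psi(X, theta) = theta * X and its theta-derivative.
        psi = theta * x
        psi_theta = x

        # Toy outer function g(u) = 0.5 * (u - 1)^2, so grad g(u) = u - 1.
        grad_g = psi - 1.0

        # Directions as stated in the TL;DR.
        d_beta = -f_beta * grad_g
        d_theta = gamma * psi_theta * (f_val - psi)

        # Both parameters move on the same time scale, then are projected.
        beta = project(beta + tau * d_beta)
        theta = project(theta + tau * d_theta)
    return beta, theta


if __name__ == "__main__":
    beta, theta = run_sketch()
    print(f"beta = {beta:.3f}, theta = {theta:.3f}")
```

Note that the $\beta$ direction evaluates $\nabla g$ at the model value $\Psi^k$ rather than at the unknown conditional expectation, which is why a single sample of $(X,Y)$ per iteration suffices.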

Abstract

We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling, reinforcement learning, and contextual optimization. We propose a specialized single time-scale stochastic method for nonconvex constrained conditional stochastic optimization problems with a Lipschitz smooth outer function and a generalized differentiable inner function. In the method, we approximate the inner conditional expectation with a rich parametric model whose mean squared error satisfies a stochastic version of a Łojasiewicz condition. The model is used by an inner learning algorithm. The main feature of our approach is that unbiased stochastic estimates of the directions used by the method can be generated with one observation from the joint distribution per iteration, which makes it applicable to real-time learning. The directions, however, are not gradients or subgradients of any overall objective function. We prove the convergence of the method with probability one, using the method of differential inclusions and a specially designed Lyapunov function, involving a stochastic generalization of the Bregman distance. Finally, a numerical illustration demonstrates the viability of our approach.
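The claim that unbiased direction estimates need only one joint observation per iteration can be checked with a short calculation. The derivation below is a sketch in the TL;DR's notation under stated assumptions: $\Psi_\theta$ and $f_\beta$ denote (transposed) Jacobians or generalized subgradients, $F_\beta$ denotes $\mathbb{E}[f_\beta(X,Y,\beta)\mid X]$, and differentiation may be interchanged with expectation. None of these conventions is confirmed by the text shown here.

```latex
% Tracking direction: conditioning on X replaces f by F, because
% \Psi and \Psi_\theta depend on X only.
\mathbb{E}\big[\Psi_\theta(X,\theta)\,\big(f(X,Y,\beta)-\Psi(X,\theta)\big)\big]
  = \mathbb{E}\big[\Psi_\theta(X,\theta)\,\big(F(X,\beta)-\Psi(X,\theta)\big)\big]
  = -\nabla_\theta Q(\beta,\theta).

% Decision direction: the same conditioning argument gives
\mathbb{E}\big[-f_\beta(X,Y,\beta)\,\nabla g\big(\Psi(X,\theta)\big)\big]
  = -\,\mathbb{E}\big[F_\beta(X,\beta)\,\nabla g\big(\Psi(X,\theta)\big)\big].
```

Under these assumptions, $\tilde d_\theta^k$ is $\gamma$ times an unbiased estimate of the negative gradient of the tracking loss $Q$. The expected $\beta$ direction evaluates $\nabla g$ at the model $\Psi$ rather than at $F$, which is consistent with the abstract's remark that the directions are not gradients or subgradients of any overall objective function.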

Paper Structure (7 sections, 16 theorems, 99 equations, 1 figure)

Introduction
Assumptions and Basic Properties
The method
Convergence Analysis
Numerical Illustration
Conclusions and Future Research
Generalized differentiability of functions

Key Result

Lemma 2.1

Suppose the function $f(X,Y,\,\cdot\,)$ is differentiable in the generalized sense and its subgradients are bounded by an integrable function in any bounded neighborhood of any point $\beta$. Then for almost all $X$ the function $\beta \mapsto \mathbb{E}[ f(X,Y,\beta) \mid X]$ is differentiable in the

Figures (1)

Figure 1: The progress of the method on the training set (the top three subgraphs) and on the test set (the bottom five subgraphs).

Theorems & Definitions (37)

Example 1.1: Reinforcement Learning
Example 1.2: Uplift Modeling
Example 1.3: Contextual Optimization
Lemma 2.1
proof
Remark 2.2
Remark 2.3
Remark 2.4
Remark 2.5
Lemma 4.1
...and 27 more
