Conditional expectation using compactification operators

Suddhasattwa Das

TL;DR

The paper tackles the problem of estimating conditional expectations in a product-space setting, unifying denoising, least-squares estimation, and manifold learning through an operator-theoretic, kernel-based framework. It develops a compactification approach in an RKHS, recasting conditional-expectation estimation as a regularized linear inverse problem and proving convergence for data-driven approximations via measures $\alpha,\nu$ approximating $\mu,\mu_X$. The main theoretical contributions include Theorem 1 (convergence of the LS solution to a smoothed conditional expectation) and its corollaries, along with Algorithm 1 for practical computation and a convergence guarantee (Theorem 2) for data-driven datasets. The proposed method yields a robust, scalable, and convergent tool for conditional expectation estimation with real-world applications to denoising and principal-curve problems.

Abstract

The separate tasks of denoising, least squares expectation, and manifold learning can often be posed in a common setting of finding the conditional expectations arising from a product of two random variables. This paper focuses on this more general problem and describes an operator-theoretic approach to estimating the conditional expectation. Kernel integral operators are used as a compactification tool to set up the estimation problem as a linear inverse problem in a reproducing kernel Hilbert space. This equation is shown to have solutions that allow numerical approximation, thus guaranteeing the convergence of data-driven implementations. The overall technique is easy to implement, and its successful application to some real-world problems is also shown.
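
To make the "regularized linear inverse problem in an RKHS" idea concrete, the sketch below estimates a conditional expectation $E[Y \mid X = x]$ from noisy samples with plain Gaussian-kernel ridge regression. This is a generic stand-in, not the paper's Algorithm 1: the Gaussian kernel, the `bandwidth` and `reg` values, the function names, and the synthetic sine data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between two 1-D sample arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def fit_conditional_mean(x, y, bandwidth=0.1, reg=1e-3):
    """Solve the regularized linear system (K + n*reg*I) c = y.

    The returned function evaluates sum_i c_i k(., x_i), a smoothed
    estimate of E[Y | X = .]. All parameter values are illustrative.
    """
    n = x.shape[0]
    gram = gaussian_kernel(x, x, bandwidth)
    coeffs = np.linalg.solve(gram + n * reg * np.eye(n), y)

    def predict(x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        return gaussian_kernel(x_new, x, bandwidth) @ coeffs

    return predict

# Synthetic data (assumed, not from the paper): Y = sin(2*pi*X) + noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=400)
true_mean = lambda t: np.sin(2.0 * np.pi * t)
y = true_mean(x) + 0.3 * rng.standard_normal(x.size)

estimate = fit_conditional_mean(x, y)
grid = np.linspace(0.05, 0.95, 50)
rmse = np.sqrt(np.mean((estimate(grid) - true_mean(grid)) ** 2))
print(f"RMSE against the true conditional mean: {rmse:.3f}")
```

The regularization term `n * reg * I` is what keeps the linear system well conditioned; without it, the kernel matrix alone can be nearly singular, which is the ill-posedness that a regularized formulation is meant to control.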

Paper Structure

This paper contains 32 sections, 9 theorems, 58 equations, 2 figures.

Key Result

Lemma 2.1

Suppose Assumption A:1 holds, and let $f$ be a function in $C(X; C(Y))$. Further suppose that there is a probability measure $\mu_X\in \mathop{\mathrm{Prob}}\nolimits(X)$ and a continuous map $m : \mathop{\mathrm{supp}}\nolimits(\mu_X) \to \mathop{\mathrm{Prob}}\nolimits(Y)$. This leads to a probabili… Then the conditional expectation eqn:def:Ex_f can be realized as a function in $C \left( \mathop{\

Figures (2)

  • Figure 1: Denoising a monochromatic image. Such an image can be expressed as a continuous function of the x-y coordinates. The mathematical formulation of the problem is described in Section \ref{sec:img_denoise}. The test image shown here is described by Equation \ref{eqn:img_denoise}. The parameter $\kappa$ is an index of the $C^1$ norm of the function. The first row shows that Algorithm \ref{algo:1} performs reasonably well for $\kappa=2$ on a $50\times 50$ pixel image, but the performance deteriorates when $\kappa=2$. The third row shows a much improved result when the image gets more detailed with an increased size of $75\times 75$.
  • Figure 2: Principal curve estimation. Section \ref{sec:principal} presents an example of a principal curve problem, from data points scattered around a "true" or "principal" curve. Equation \ref{eqn:elctrc2} is a realization of Assumptions \ref{A:1} and \ref{A:2}, and presents a simplified view of electrostatic charge distribution along a wire. We assume that the function $\lambda$ takes the form in Equation \ref{eqn:elctrc1}. The left panels above show the results of applying Algorithm \ref{algo:1} to data equidistributed with respect to this distribution, to recover the conditional expectation as a function over $X=[0,1]$. The results show a close match with the true mean, which is simply the curve $\lambda$. The results also visibly improve as the number of samples is increased. The right panel shows a repeated use of Algorithm \ref{algo:1} to reconstruct the variance as a function over $X=[0,1]$. Again, the results show a strong match with the true function, which is $\rho$.
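
The "repeated use" idea in the Figure 2 caption can be sketched as two passes of a conditional-mean estimator: one pass to recover the mean curve, and a second pass on the squared residuals to recover the variance. The code below is only an illustration under stated assumptions. It uses a generic Gaussian-kernel ridge smoother in place of the paper's Algorithm 1, and the functions `lam` and `rho` are hypothetical stand-ins, since the actual forms of $\lambda$ and $\rho$ are not given here.

```python
import numpy as np

def kernel_smoother(x, targets, bandwidth=0.1, reg=1e-3):
    """Estimate E[targets | x] by Gaussian-kernel ridge regression (illustrative)."""
    n = x.shape[0]
    gram = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * bandwidth ** 2))
    coeffs = np.linalg.solve(gram + n * reg * np.eye(n), targets)

    def predict(x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        cross = np.exp(-((x_new[:, None] - x[None, :]) ** 2) / (2.0 * bandwidth ** 2))
        return cross @ coeffs

    return predict

# Hypothetical mean curve and variance profile over X = [0, 1].
lam = lambda t: 0.5 * np.sin(2.0 * np.pi * t)
rho = lambda t: 0.02 + 0.08 * t

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=1000)
y = lam(x) + np.sqrt(rho(x)) * rng.standard_normal(x.size)

# Pass 1: conditional mean of Y given X.
mean_hat = kernel_smoother(x, y)

# Pass 2: conditional mean of the squared residuals, i.e. the variance.
residual_sq = (y - mean_hat(x)) ** 2
var_hat = kernel_smoother(x, residual_sq, bandwidth=0.2, reg=1e-2)

grid = np.linspace(0.1, 0.9, 9)
mean_err = np.max(np.abs(mean_hat(grid) - lam(grid)))
var_err = np.max(np.abs(var_hat(grid) - rho(grid)))
print(f"max mean error: {mean_err:.3f}, max variance error: {var_err:.3f}")
```

The second pass uses a wider bandwidth and stronger regularization because squared residuals are noisier than the raw observations; these values are a judgment call for this toy data, not settings taken from the paper.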

Theorems & Definitions (9)

  • Lemma 2.1
  • Theorem 1
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • Theorem 2
  • Lemma 6.1
  • Lemma 6.2