Kernel Mean Embedding Topology: Weak and Strong Forms for Stochastic Kernels and Implications for Model Learning

Naci Saldi, Serdar Yuksel

TL;DR

This work introduces Kernel Mean Embedding Topology (KMET) for stochastic kernels, establishing weak and strong forms that connect to classical topologies like Young-narrow and $w^*$-topology via equivalence results. The weak formulation leverages RKHS-based kernel mean embeddings and the Maximum Mean Discrepancy (MMD) to define a Hilbert-space-compatible topology on kernel distributions, while the strong form uses a relative strong-norm topology to address robustness and learning in model-based settings. The authors prove that the three weak topologies are topologically equivalent on stochastic kernels under certain conditions, though closure/compactness properties differ, and show that the strong form dominates the weak form under appropriate regularity. Applications focus on robustness and learning in Markov Decision Processes (MDPs), showing convergence of optimal values and policies when perturbed kernels converge under either KMET, and discuss implications for empirical model learning and policy continuity. The results highlight KMET as a versatile framework for analyzing optimality, approximations, and robustness across stochastic-control contexts, with the RKHS-based structure enabling data-driven kernel approximations and simulations.
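To make the MMD mentioned above concrete: for a kernel $k$ with RKHS $H$, the mean embedding of a probability measure $P$ is $\mu_P = \int k(\cdot, y)\, P(dy)$, and $\mathrm{MMD}(P, Q) = \|\mu_P - \mu_Q\|_H$. The following is a minimal sketch of the standard biased empirical estimate of the squared MMD between two samples. The Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def gaussian_kernel(u, v, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples.

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average of k(x, y) over all pairs (x, y) with x in xs and y in ys.

    This equals the RKHS inner product of the two empirical mean embeddings.
    """
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between two samples.

    Expands ||mu_X - mu_Y||_H^2 = <mu_X, mu_X> + <mu_Y, mu_Y> - 2 <mu_X, mu_Y>,
    where mu_X and mu_Y are the empirical mean embeddings of xs and ys.
    """
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )
```

Calling `mmd_squared(xs, ys)` on two lists of tuples returns a value near zero when the samples are identical and a larger value as the samples move apart. For a stochastic kernel, the same quantity could be evaluated pointwise between the conditional distributions at each input, which is the kind of Hilbert-space distance the weak and strong forms are built on.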

Abstract

We introduce a novel topology, called Kernel Mean Embedding Topology, for stochastic kernels, in a weak and strong form. This topology, defined on the spaces of Bochner integrable functions from a signal space to a space of probability measures endowed with a Hilbert space structure, allows for a versatile formulation. This construction allows one to obtain both a strong and weak formulation. (i) For its weak formulation, we highlight the utility on relaxed policy spaces, and investigate connections with the Young narrow topology and Borkar (or $w^*$)-topology, and establish equivalence properties. We report that, while both the $w^*$-topology and kernel mean embedding topology are relatively compact, they are not closed. Conversely, while the Young narrow topology is closed, it lacks relative compactness. (ii) We show that the strong form provides an appropriate formulation for placing topologies on spaces of models characterized by stochastic kernels with explicit robustness and learning theoretic implications on optimal stochastic control under discounted or average cost criteria. (iii) We thus show that this topology possesses several properties making it ideal to study optimality and approximations (under the weak formulation) and robustness (under the strong formulation) for many applications.

Paper Structure

This paper contains 20 sections, 17 theorems, 108 equations, 2 figures.

Key Result

Theorem 1.1

Let $\{\gamma_{\lambda}\}$ be a sequence of stochastic kernels from one Borel space to another (locally compact) Borel space and let $\gamma$ also be a stochastic kernel. Then, the following are equivalent:

Figures (2)

  • Figure 1: Hierarchy of topologies on stochastic kernels and their relations to the two topologies introduced in our paper, the strong and the weak kernel mean embedding topology. (a) The Young-narrow and Borkar topologies are known in the literature, and their equivalence on the set of stochastic kernels is studied in [yuksel2023borkar]. (b) In this paper, we introduce the weak kernel mean embedding topology and demonstrate its equivalence to the Young-narrow and Borkar topologies (see Theorem \ref{equivalence}). (c) We also introduce a strong version of the kernel mean embedding topology, which has been used to establish empirical consistency results for approximating the conditional mean embeddings of stochastic kernels [TaBa24, MaPe23]. Convergence in the strong version clearly implies convergence in the weak version. (d') As established in Theorem \ref{strongStrongerthanWeakThm}, convergence in the weak kernel mean embedding topology does not imply convergence in the strong version. (d) However, as shown in Theorem \ref{WeakImpliesStrong}, under the assumption of bounded and equicontinuous densities for the probability measures in the stochastic kernels, convergence in the weak version does imply convergence in the strong version. (e) Using the conditional mean embedding framework [SoHuSmFu09], one can view stochastic kernels as operators between an RKHS and an $L_2$-space [KlScSu20], endowed with the operator norm topology. [MaPe23] shows that the strong kernel mean embedding topology with $q=2$ dominates this topology. (f) Alternatively, stochastic kernels can be viewed as Hilbert-Schmidt operators between an RKHS and an $L_2$-space, where the Hilbert-Schmidt norm topology, which is stronger than the operator norm topology, is equivalent to the strong kernel mean embedding topology with $q=2$ [MaPe23].
  • Figure 2: Functions $f_1$, $f_2$, and $f_3$.

Theorems & Definitions (46)

  • Theorem 1.1
  • Theorem 2.1
  • Remark 1
  • Remark 2
  • Definition 1: Maximum Mean Discrepancy
  • Lemma 3.1
  • proof
  • Lemma 3.2
  • proof
  • Lemma 3.3
  • ...and 36 more