Manifold Regularization Classification Model Based On Improved Diffusion Map

Hongfu Guo, Wencheng Zou, Zeyu Zhang, Shuishan Zhang, Ruitong Wang, Jintao Zhang

TL;DR

This paper enhances the probability transition matrix of the diffusion map algorithm, enabling it to accurately depict the label propagation process on the manifold, and extends the label propagation function to the entire data manifold.

Abstract

The manifold regularization model is a semi-supervised learning model that generates classifiers by leveraging the geometric structure of a dataset comprising a small number of labeled samples and a large number of unlabeled samples. However, the original manifold norm limits the performance of models to local regions. To address this limitation, this paper proposes an approach to improve manifold regularization based on a label propagation model. We first enhance the probability transition matrix of the diffusion map algorithm, which can be used to estimate the Neumann heat kernel, so that it accurately depicts the label propagation process on the manifold. Using this matrix, we establish a label propagation function on the dataset that describes the distribution of labels at different time steps. We then extend the label propagation function to the entire data manifold and prove that the extended function converges to a stable distribution after a sufficiently long time, so that it can be regarded as a classifier. Building on this result, we propose a viable improvement to the manifold regularization model and validate its superiority through experiments.
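
To make the idea of label propagation through a probability transition matrix concrete, the following is a minimal sketch in Python. It is not the paper's improved matrix or its Neumann heat kernel estimate: it assumes the standard diffusion map construction (a Gaussian kernel that is row-normalized into a Markov matrix) and a common clamping rule that resets labeled points after every step. The names `transition_matrix`, `propagate_labels`, `epsilon`, and `steps`, as well as the toy two-cluster data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def transition_matrix(X, epsilon=0.5):
    """Row-stochastic Markov matrix built from a Gaussian kernel.

    This is the standard diffusion map step (an assumption here),
    not the improved matrix proposed in the paper.
    """
    # Pairwise squared Euclidean distances between all samples.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / epsilon)
    # Normalize each row so that it sums to one.
    return K / K.sum(axis=1, keepdims=True)


def propagate_labels(P, y0, labeled_mask, steps=50):
    """Iterate f <- P f, clamping labeled points to their known labels."""
    f = y0.astype(float).copy()
    for _ in range(steps):
        f = P @ f
        f[labeled_mask] = y0[labeled_mask]
    return f


# Toy data (hypothetical): two well-separated clusters of 20 points each,
# with exactly one labeled point per cluster.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 0.3, size=(20, 2)),
    rng.normal(2.0, 0.3, size=(20, 2)),
])
y0 = np.zeros(40)
labeled_mask = np.zeros(40, dtype=bool)
y0[0], y0[20] = -1.0, 1.0
labeled_mask[[0, 20]] = True

P = transition_matrix(X)
f = propagate_labels(P, y0, labeled_mask)
pred = np.sign(f)  # predicted class in {-1, +1} for every sample
```

In this sketch the value of `f` after a given number of steps plays the role of the label distribution at that time step, and taking its sign after many steps yields a classifier on the sample points only. Extending the function to the whole manifold, as the abstract describes, is not attempted here.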

Paper Structure (19 sections, 10 theorems, 64 equations, 8 figures, 4 algorithms)

Key Result

Theorem 1

Let $X$ be a dataset with a finite number of elements. For the binary classification problem, let $X_{l}$ be the labeled dataset and $X_{u}$ the unlabeled dataset. There exists $Q(x)\in C^{\infty}(\mathcal{M})$ such that […], where $\mathcal{M}$ is the sub-manifold in which the dataset is embedded.

Figures (8)

  • Figure 1: Diffusion process on a helix-shaped dataset formed by 300 points
  • Figure 2: The label propagation process based on transition matrix constructed by Euclidean distance
  • Figure 3: The label propagation process based on transition matrix constructed by geodesic distance
  • Figure 4: The label propagation process based on transition matrix constructed by geodesic distance
  • Figure 5: The performance of NHKRLS and LapRLS on the twoClusters, twomoon, and Ring datasets. Labeled points are shown as diamonds and unlabeled points as circles.
  • ...and 3 more figures

Theorems & Definitions (15)

  • Theorem 1: The existence of initial propagation function
  • Theorem 2
  • Definition 1: Kernel function
  • Theorem 3
  • Corollary 1
  • Lemma 1
  • proof
  • Lemma 2
  • Theorem 4
  • proof
  • ...and 5 more