Semi-supervised Fréchet Regression

Rui Qiu, Zhou Yu, Zhenhua Lin

TL;DR

This work tackles regression where the response lies in a non-Euclidean metric space by leveraging abundant unlabeled feature data. It introduces two semi-supervised Fréchet regression estimators, a Nadaraya-Watson (NW) variant and a k-nearest-neighbor (kNN) variant, both built on graph-based geodesic distances that exploit a low-dimensional manifold structure in the features. The authors establish nonasymptotic convergence rates that adapt to the intrinsic dimension $q$ of the manifold, achieving minimax-optimal rates of order $n^{-\frac{2\beta}{2\beta+q}}$ in the Hölder regime, and demonstrate substantial empirical gains over purely supervised methods in simulations and on a real face dataset. The results justify using unlabeled data to learn the manifold geometry and improve non-Euclidean prediction accuracy, with practical impact for distributional data, SPD matrices, and spherical targets. The study also discusses limitations and future extensions, including online semi-supervised learning and potential deep-learning integrations.

Abstract

This paper explores the field of semi-supervised Fréchet regression, driven by the significant costs associated with obtaining non-Euclidean labels. Methodologically, we propose two novel methods: semi-supervised NW Fréchet regression and semi-supervised kNN Fréchet regression, both based on graph distance acquired from all feature instances. These methods extend the scope of existing semi-supervised Euclidean regression methods. We establish their convergence rates with limited labeled data and large amounts of unlabeled data, taking into account the low-dimensional manifold structure of the feature space. Through comprehensive simulations across diverse settings and applications to real data, we demonstrate the superior performance of our methods over their supervised counterparts. This study addresses existing research gaps and paves the way for further exploration and advancements in the field of semi-supervised Fréchet regression.
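The idea of replacing Euclidean feature distances with graph distances computed from all feature instances can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`graph_distances`, `knn_frechet_predict`, `nw_frechet_predict`), the Gaussian kernel, and the toy data are all assumptions. It also assumes the responses are vectors in a Euclidean (or Hilbert-embedded) space, where the weighted Fréchet mean reduces to a weighted average; for a general metric space that averaging step would be replaced by a minimization of weighted squared distances.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist


def graph_distances(X_all, r):
    """Shortest-path distances on the r-neighbourhood graph of all feature points
    (labeled and unlabeled together). Unreachable pairs get distance inf."""
    D = cdist(X_all, X_all)
    W = np.where(D <= r, D, 0.0)  # keep only edges no longer than r
    # csr_matrix drops the zeros, so they are treated as "no edge"
    return shortest_path(csr_matrix(W), method="D", directed=False)


def knn_frechet_predict(d_to_labeled, Y_labeled, k):
    """Equal-weight Fréchet mean of the k graph-nearest labeled responses.
    Assumption: Euclidean responses, so the Fréchet mean is the arithmetic mean."""
    idx = np.argsort(d_to_labeled)[:k]
    return Y_labeled[idx].mean(axis=0)


def nw_frechet_predict(d_to_labeled, Y_labeled, h):
    """Kernel-weighted Fréchet mean with a Gaussian kernel on graph distance.
    Assumption: at least one labeled point is reachable from the query."""
    w = np.exp(-0.5 * (d_to_labeled / h) ** 2)  # weight 0 for unreachable points
    return (w[:, None] * Y_labeled).sum(axis=0) / w.sum()


# Toy data (hypothetical): seven points along a U-shaped path in the plane.
# Only points 3 and 6 carry labels; the rest play the role of unlabeled data.
X_all = np.array(
    [[0, 0], [1, 0], [2, 0], [2, 1], [2, 2], [1, 2], [0, 2]], dtype=float
)
labeled = np.array([3, 6])
Y_labeled = np.array([[1.0], [5.0]])

G = graph_distances(X_all, r=1.1)
d_query = G[0, labeled]  # graph distances from point 0 to the labeled points

pred_knn = knn_frechet_predict(d_query, Y_labeled, k=1)
pred_nw = nw_frechet_predict(d_query, Y_labeled, h=1.0)
```

In this toy example, point 6 is the closer labeled point to the query in Euclidean distance, but point 3 is closer along the path. The graph distance therefore selects point 3, which is the kind of manifold-aware neighbor choice that the unlabeled points make possible.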

Paper Structure

This paper contains 8 sections, 2 theorems, 26 equations, and 11 figures.

Key Result

Theorem 3.2

Assume $X$ is supported on a $q$-dimensional submanifold $\mathcal{M}$ embedded in $\mathcal{R}^p$ $(q \leq p)$. Moreover, assume (A1)--(A7). Let $\epsilon > 0$ and $\lambda \in (0,1)$, and let $\tau, c_0 > 0$ be as defined in Lemma 6 of the supplementary materials. Then, for sufficiently large $m$ satisfying $m^{(\lambda-1)/

Figures (11)

  • Figure 1: Sparse $r$-graph constructed by $300$ unlabeled points, $30$ labeled points and one new point from $[0, 1]^2$ with $r=0.1$.
  • Figure 2: A typical Swiss roll in three-dimensional Euclidean space with $1000$ sample points.
  • Figure 3: AMSE of different methods for setting I, II with $100$ labeled points under snr$=2, 4$ when the dimension of $X$ is equal to $3$.
  • Figure 4: AMSE of different methods for setting III, IV with $100$ labeled points when the dimension of $X$ is equal to $3$.
  • Figure 5: AMSE of different methods for setting I, III with $100$ labeled points when the dimension of $X$ is equal to $3$ and the components of $U$ are correlated.
  • ...and 6 more figures

Theorems & Definitions (6)

  • Remark 2.1
  • Remark 2.2
  • Definition 3.1
  • Theorem 3.2: Semi-supervised Fréchet regression
  • Remark 3.3
  • Theorem 3.4: Supervised Fréchet regression