Semi-supervised Fréchet Regression
Rui Qiu, Zhou Yu, Zhenhua Lin
TL;DR
This work tackles regression in which the response lies in a non-Euclidean metric space, leveraging abundant unlabeled feature data. It introduces two semi-supervised Fréchet regression estimators, a Nadaraya-Watson (NW) variant and a k-nearest-neighbor (kNN) variant, both built on graph-based geodesic distances that exploit a low-dimensional manifold structure in the feature space. The authors establish nonasymptotic convergence rates that adapt to the intrinsic dimension $q$ of the manifold, achieving minimax-optimal rates of order $n^{-\frac{2\beta}{2\beta+q}}$ in the Hölder regime, and demonstrate substantial empirical gains over purely supervised methods in simulations and on a real dataset of face images. The results justify using unlabeled data to learn the manifold geometry and thereby improve non-Euclidean prediction accuracy, with practical relevance for distributional data, SPD matrices, and spherical targets. The study also discusses limitations and future extensions, including online semi-supervised learning and potential integration with deep learning.
Abstract
This paper studies semi-supervised Fréchet regression, motivated by the high cost of obtaining non-Euclidean labels. Methodologically, we propose two novel methods, semi-supervised NW Fréchet regression and semi-supervised kNN Fréchet regression, both based on a graph distance computed from all feature instances, labeled and unlabeled. These methods extend existing semi-supervised Euclidean regression methods to non-Euclidean responses. We establish their convergence rates with limited labeled data and large amounts of unlabeled data, taking into account the low-dimensional manifold structure of the feature space. Through comprehensive simulations across diverse settings and applications to real data, we demonstrate the superior performance of our methods over their supervised counterparts. This study addresses existing research gaps and paves the way for further work on semi-supervised Fréchet regression.
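
To make the idea of a graph-distance-based NW Fréchet estimator more concrete, the following is a minimal sketch under stated assumptions and is not the authors' implementation. It builds a kNN graph over all features (labeled and unlabeled), approximates geodesic distances by shortest paths, and forms a kernel-weighted Fréchet mean of the labeled responses. The function names, the Gaussian kernel, the parameters `k` and `h`, and the toy data are all hypothetical. The responses are assumed to be one-dimensional distributions stored as quantile vectors, in which case the weighted Fréchet mean under the 2-Wasserstein metric with nonnegative weights reduces to a weighted average of the quantile vectors.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def graph_distances(adj, source):
    """Dijkstra shortest-path lengths from `source` (approximate geodesics)."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def nw_frechet_predict(features, labeled_idx, quantiles, target_idx, k=5, h=0.5):
    """NW-type weighted Frechet mean at features[target_idx].

    quantiles[j] is the quantile vector of the response attached to
    features[labeled_idx[j]]. The graph uses ALL features, so unlabeled
    points help shape the distances even though they carry no response.
    """
    adj = knn_graph(features, k)
    dist = graph_distances(adj, target_idx)
    # Gaussian kernel on graph distance (an assumed kernel choice).
    weights = [math.exp(-((dist[i] / h) ** 2)) for i in labeled_idx]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights vanished; increase h or k")
    m = len(quantiles[0])
    # Weighted average of quantile vectors = weighted Wasserstein Frechet mean.
    return [
        sum(w * q[t] for w, q in zip(weights, quantiles)) / total
        for t in range(m)
    ]


# Toy usage: 60 features on a half circle (a 1-D manifold in the plane),
# of which only every 6th point is labeled.
n = 60
angles = [math.pi * i / (n - 1) for i in range(n)]
features = [(math.cos(a), math.sin(a)) for a in angles]
labeled_idx = list(range(0, n, 6))
grid = (-1.0, 0.0, 1.0)  # hypothetical quantile offsets of a location family
quantiles = [[angles[i] + g for g in grid] for i in labeled_idx]
pred = nw_frechet_predict(features, labeled_idx, quantiles, target_idx=30, k=2, h=0.3)
```

In this toy setting the unlabeled points contribute only through the graph: with ten labels alone, a kNN graph would connect points across long chords of the arc, whereas the full set of 60 features lets shortest paths follow the curve.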
