Semi-Supervised Learning via Cross-Prediction-Powered Inference for Wireless Systems

Houssem Sifaou, Osvaldo Simeone

TL;DR

This work investigates how to leverage synthetic labels produced by an ML model while accounting for their inherent bias with respect to the true labels. It proposes a tuned CPPI method that, in simulations, achieves the best performance among all benchmark schemes.

Abstract

In many wireless application scenarios, acquiring labeled data can be prohibitively costly, requiring complex optimization processes or measurement campaigns. Semi-supervised learning leverages unlabeled samples to augment the available dataset by assigning synthetic labels obtained via machine learning (ML)-based predictions. However, treating the synthetic labels as true labels may yield worse-performing models as compared to models trained using only labeled data. Inspired by the recently developed prediction-powered inference (PPI) framework, this work investigates how to leverage the synthetic labels produced by an ML model, while accounting for the inherent bias concerning true labels. To this end, we first review PPI and its recent extensions, namely tuned PPI and cross-prediction-powered inference (CPPI). Then, we introduce two novel variants of PPI. The first, referred to as tuned CPPI, provides CPPI with an additional degree of freedom in adapting to the quality of the ML-based labels. The second, meta-CPPI (MCPPI), extends tuned CPPI via the joint optimization of the ML labeling models and of the parameters of interest. Finally, we showcase two applications of PPI-based techniques in wireless systems, namely beam alignment based on channel knowledge maps in millimeter-wave systems and received signal strength information-based indoor localization. Simulation results show the advantages of PPI-based techniques over conventional approaches that rely solely on labeled data or that apply standard pseudo-labeling strategies from semi-supervised learning. Furthermore, the proposed tuned CPPI method is observed to guarantee the best performance among all benchmark schemes, especially in the regime of limited labeled data.

Paper Structure

This paper contains 30 sections, 1 theorem, 56 equations, 10 figures, 2 algorithms.

Key Result

Theorem 1

For $n\to\infty$ with $n/N=r$, assume that the estimate $\hat{\lambda}_n$ converges to some value $\lambda$, i.e., $\hat{\lambda}_n \overset{P}\longrightarrow \lambda$, and that the corresponding parameter $\theta^{\rm CP}_{\hat{\lambda}_n}$ in tcp converges to the optimal value $\theta^\star$, i.e with covariance matrix

Figures (10)

  • Figure 1: Illustration of the original PPI scheme [angelopoulos2023prediction]. Using the labeled data and a pre-trained model $f(\cdot)$, the rectifier term $\Delta^{\rm PP}(\theta)=\frac{1}{n}\sum_{i=1}^n \left[\ell_\theta(X_i,f(X_i))-\ell_\theta(X_i,Y_i)\right]$ is evaluated to estimate the prediction bias of the model $f(\cdot)$. This term is subtracted from the unlabeled loss $\frac{1}{N}\sum_{i=1}^N \ell_\theta(\tilde{X}_i,f(\tilde{X}_i))$, obtaining the PPI loss $L^{\rm PP}(\theta)$ in \ref{loss_ppi_intro}.
  • Figure 2: Illustration of the CPPI scheme [zrnic2023cross]. The labeled data is divided into $K$ folds $\mathcal{D}^{(1)},\cdots,\mathcal{D}^{(K)}$, and $K$ prediction models are trained, with each model $f^{(k)}(\cdot)$ being trained on all labeled data except for fold $\mathcal{D}^{(k)}$. Using the $K$ trained models, a rectifier $\Delta^{\rm CP}(\theta)=\frac{1}{n}\sum_{k=1}^K \sum_{i\in\mathcal{D}^{(k)}}\left[\ell_\theta(X_i,f^{(k)}(X_i)) -\ell_\theta(X_i,Y_i)\right]$ is evaluated that estimates the prediction bias of models $\{f^{(k)}(\cdot)\}_{k=1}^K$. This term is subtracted from the unlabeled loss $\frac{1}{KN}\sum_{k=1}^K\sum_{i=1}^N \ell_\theta(\tilde{X}_i,f^{(k)}(\tilde{X}_i))$, obtaining the CPPI loss $L^{\rm CP}(\theta)$ in \ref{cppi_loss_intro}.
  • Figure 3: Illustration of the two application scenarios: (a) beam alignment in mmWave communication systems, in which the optimal beam index $Y$ is determined based on the device location $X$; and (b) an indoor localization system based on RSSI, in which the position of the device $Y$ is predicted based on RSSI measurements $X$ received from access points. In both cases, a pre-trained ML model $f(\cdot)$ can be used to augment the labeled datasets, with a channel knowledge map (CKM) adopted for beam alignment.
  • Figure 4: Mean squared error as a function of the size of the labeled dataset for the problem of mean estimation under the synthetic data-generation model \ref{data_gen_model}. The results are averaged over $300$ trials.
  • Figure 5: Mean squared error as a function of the size of the labeled dataset for the problem of linear regression under the synthetic data-generation model \ref{data_gen_model}. The results are averaged over $300$ trials.
  • ...and 5 more figures
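
To make the rectifier constructions in Figures 1 and 2 concrete, the following is a minimal sketch for the special case of mean estimation with the squared loss $\ell_\theta(x,y)=(y-\theta)^2/2$. In that case, setting the gradient of the PPI or CPPI loss to zero gives a closed form: the average prediction on the unlabeled data minus the average prediction error on the labeled data. The data model, the biased predictor `f`, the linear fold models, and all names below are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data model (an assumption): Y = 2X + 1 + noise, so E[Y] = 1.
def sample(m):
    x = rng.normal(size=m)
    y = 2.0 * x + 1.0 + 0.5 * rng.normal(size=m)
    return x, y

n, N, K = 100, 10_000, 5
X, Y = sample(n)          # small labeled dataset
X_tilde, _ = sample(N)    # large unlabeled dataset (labels discarded)

# --- PPI (Figure 1): fixed "pre-trained" model, deliberately biased by +0.5 ---
def f(x):
    return 2.0 * x + 1.5

theta_naive = f(X_tilde).mean()        # pseudo-labels treated as true labels
rectifier_pp = (f(X) - Y).mean()       # prediction bias estimated on labeled data
theta_ppi = theta_naive - rectifier_pp

# --- CPPI (Figure 2): K models, each trained on all folds but one ---
def fit_linear(x, y):
    """Least-squares fit of y ~ a*x + b; returns the fitted predictor."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda z: coef[0] * z + coef[1]

folds = np.array_split(rng.permutation(n), K)
unlabeled_term = 0.0
rectifier_cp = 0.0
for idx in folds:
    train = np.ones(n, dtype=bool)
    train[idx] = False                                  # hold out fold k
    f_k = fit_linear(X[train], Y[train])
    unlabeled_term += f_k(X_tilde).mean() / K           # (1/(K N)) sum_k sum_i f_k
    rectifier_cp += (f_k(X[idx]) - Y[idx]).sum() / n    # held-out prediction error
theta_cppi = unlabeled_term - rectifier_cp

print(theta_naive, theta_ppi, theta_cppi)
```

In this sketch the naive pseudo-label estimate inherits the bias built into `f`, while the rectifier removes it using only the labeled samples. The CPPI variant needs no pre-trained model, since each fold's prediction error is measured on data its model did not see.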

Theorems & Definitions (2)

  • Theorem 1
  • proof