Learning Cellular Network Connection Quality with Conformal
Hanyang Jiang, Elizabeth Belding, Ellen Zegura, Yao Xie
TL;DR
The paper tackles uncertainty in cellular network speeds by introducing Ensemble Spatial Conformal Prediction (ESCP), a framework that combines self-tuning bandwidth kernel regression (STBKR) for spatially adaptive speed estimation with conformal prediction to produce prediction intervals. STBKR uses a location-dependent bandwidth $h(x)=cR_k(x)^2$, where $R_k(x)$ is the distance from $x$ to its $k$-th nearest neighbor and $c$ is selected by cross-validation. The bandwidth therefore widens where data are sparse and narrows where they are dense, which addresses spatial data imbalance. ESCP then uses bootstrap ensembles and quantile regression forests (QRF) fitted to local residuals to generate per-location prediction intervals with finite-sample coverage. Evaluations on large Ookla datasets for Georgia and New Mexico show valid coverage at the desired level with tighter intervals than baseline conformal methods. The resulting uncertainty maps identify urban areas with high variability and rural regions with limited data, which guides targeted data collection. In practice, the method produces reliable connection-quality maps that can inform infrastructure planning, digital inclusion efforts, and active data collection strategies aimed at improving prediction accuracy and resource allocation.
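The self-tuning bandwidth rule can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `stbkr_predict`, the Gaussian kernel, the use of $h(x)$ as a variance-like scale, and the small `eps` guard are all hypothetical choices, since the summary only specifies $h(x)=cR_k(x)^2$.

```python
import numpy as np

def stbkr_predict(query, train_x, train_y, k=5, c=1.0, eps=1e-12):
    """Kernel regression with bandwidth h(x) = c * R_k(x)^2 at each query point."""
    query = np.asarray(query, dtype=float)
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dist = np.linalg.norm(query[:, None, :] - train_x[None, :, :], axis=2)
    # R_k(x): distance from each query point to its k-th nearest training point.
    r_k = np.partition(dist, k - 1, axis=1)[:, k - 1]
    # eps is an assumed guard against a zero bandwidth at duplicated locations.
    h = c * r_k ** 2 + eps
    # ASSUMPTION: Gaussian kernel with h acting as a variance-like scale.
    weights = np.exp(-dist ** 2 / (2.0 * h[:, None]))
    pred = (weights @ train_y) / weights.sum(axis=1)
    return pred, h

# Toy data: a dense cluster near the origin plus two isolated points.
train_x = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5, 5], [9, 9]])
train_y = np.array([50.0, 52.0, 48.0, 51.0, 10.0, 12.0])
query = np.array([[0.05, 0.05], [7.0, 7.0]])
pred, h = stbkr_predict(query, train_x, train_y, k=2, c=1.0)
```

In this toy setup the query inside the dense cluster gets a much smaller bandwidth than the query in the sparse region, which is the behavior the rule is meant to produce. In practice `c` would be chosen by cross-validation, as the summary states.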
Abstract
In this paper, we address the problem of uncertainty quantification for cellular network speed. It is well known that the internet speed experienced by a mobile phone can fluctuate significantly, even when the phone remains in a single location. This high variability means that a point estimate of network speed is insufficient; it is more useful to establish a prediction interval that covers the expected range of speed variations. Building an accurate network estimation map requires collecting a large amount of mobile data at different locations. Currently, public datasets rely on users to upload data through apps. Although a massive amount of data has been collected, the datasets suffer from significant noise due to the nature of cellular networks and various other factors. Additionally, uneven population density makes data collection spatially inconsistent, leading to substantial uncertainty in the network quality maps derived from these data. We focus our analysis on large-scale internet-quality datasets provided by Ookla to construct an estimated map of connection quality. To improve the reliability of this map, we introduce a novel conformal prediction technique to build an uncertainty map, which quantifies how reliable the prediction is in different areas. We identify regions with heightened uncertainty to prioritize targeted, manual data collection. Our method also leads to a sampling strategy that guides researchers to selectively gather high-quality data that best complements the current dataset and improves the overall accuracy of the prediction model.
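To make the idea of a prediction interval concrete, here is a minimal sketch of standard split conformal prediction. It is the generic textbook procedure, not the novel technique introduced in the paper; the function name `split_conformal_interval` and the use of absolute residuals as the conformity score are assumptions made for illustration.

```python
import math
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Symmetric intervals from absolute residuals on a held-out calibration set."""
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    scores = np.sort(np.abs(cal_true - cal_pred))
    n = scores.size
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only an unbounded interval is valid.
    q = scores[rank - 1] if rank <= n else np.inf
    return test_pred - q, test_pred + q

# Toy calibration set: predictions of 0 with residuals 1, 2, ..., 10.
lower, upper = split_conformal_interval(
    cal_pred=np.zeros(10),
    cal_true=np.arange(1, 11, dtype=float),
    test_pred=np.array([5.0]),
    alpha=0.1,
)
```

This procedure yields one interval width for every location. A spatially adaptive method instead has to vary the width by location, which is what allows an uncertainty map to distinguish reliable regions from unreliable ones.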
