Environmental Feature Engineering and Statistical Validation for ML-Based Path Loss Prediction

Jonathan Ethier; Mathieu Chateauvert; Ryan G. Dempsey; Alexis Bose

TL;DR

This work improves path loss prediction by enriching a compact neural network with GIS-derived environment information, specifically obstructions extracted from a digital surface model (DSM). The network takes eight scalar features defined along the direct Tx–Rx path and uses two hidden layers of 64 neurons each, with dropout and $L_2$-style regularization. Generalization is validated on six geographically distinct UK holdouts and on an intercontinental blind test using Canadian data, with RMSE as low as $6.74$ dB and $R^2=0.88$. The study demonstrates robustness to initialization and to train/validation splits, and shows consistent improvements over prior baselines (e.g., P.1812) without overfitting despite the richer feature set. The GIS-based feature engineering enables scalable, environment-aware path loss modeling across $0.5$–$6$ GHz without terrain-type labels; future work targets diffraction effects and higher-frequency regimes via additional obstruction properties.
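The architecture summarized above is small enough to sketch directly. The following is a minimal NumPy illustration, not the authors' implementation: the input width (8), the two hidden layers of 64 neurons, dropout, and an $L_2$ penalty come from the text, while the ReLU activation, the He-style initializer, the dropout rate, the penalty weight `l2`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

N_FEATURES = 8   # eight scalar path features (from the text)
N_HIDDEN = 64    # two hidden layers of 64 neurons each (from the text)


def init_params(rng):
    """Create weights and biases; the He-style initializer is an assumption."""
    sizes = [N_FEATURES, N_HIDDEN, N_HIDDEN, 1]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x, rng=None, dropout_rate=0.0):
    """Predict path loss (dB) for a batch x of shape (n, 8).

    Dropout is applied only when an rng is supplied (training mode).
    """
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU is an assumed activation
        if rng is not None and dropout_rate > 0.0:
            keep = rng.random(h.shape) >= dropout_rate
            h = h * keep / (1.0 - dropout_rate)  # inverted dropout scaling
    w, b = params[-1]
    return (h @ w + b)[:, 0]


def loss(params, x, y, rng=None, dropout_rate=0.0, l2=1e-4):
    """Mean squared error plus an L2 penalty on the weight matrices."""
    pred = forward(params, x, rng, dropout_rate)
    mse = np.mean((pred - y) ** 2)
    penalty = l2 * sum(np.sum(w ** 2) for w, _ in params)
    return mse + penalty


rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.normal(size=(4, N_FEATURES))           # placeholder feature rows
y = rng.normal(loc=120.0, scale=10.0, size=4)  # placeholder path loss targets (dB)
print(forward(params, x).shape)                # (4,)
```

Under these assumptions (bias terms and a single scalar output), the network has 4801 trainable parameters, which illustrates how compact the model is.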

Abstract

Wireless communications rely on path loss modeling, which is most effective when it includes the physical details of the propagation environment. Acquiring this data has historically been challenging, but geographic information system (GIS) data is becoming increasingly available at higher resolution and accuracy. Access to such details enables propagation models to predict coverage more accurately and to account for interference in wireless deployments. Machine learning-based modeling can significantly support this effort, with feature-based approaches allowing for accurate, efficient, and scalable propagation modeling. Building on previous work, we introduce an extended set of features that improves prediction accuracy while, most importantly, demonstrating model generalization through rigorous statistical assessment and the use of test set holdouts.
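Holdout-based validation of this kind can be illustrated with a short sketch. It assumes, as one plausible reading rather than a confirmed detail of the paper, that each geographic region is held out in turn while the model is trained on the rest; the function name `leave_one_region_out` and the region labels are hypothetical.

```python
from collections import defaultdict


def leave_one_region_out(region_labels):
    """Yield (held_out_region, train_indices, test_indices) per distinct region.

    Every sample of the held-out region goes to the test set, so no
    location from that region leaks into training.
    """
    by_region = defaultdict(list)
    for i, region in enumerate(region_labels):
        by_region[region].append(i)
    for held_out in sorted(by_region):
        test_idx = by_region[held_out]
        train_idx = sorted(
            i for region, idx in by_region.items() if region != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


# Hypothetical labels: one region tag per measurement sample.
labels = ["A", "B", "A", "C", "B", "C"]
folds = list(leave_one_region_out(labels))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by region rather than by random sample is what makes the test blind: nearby measurements are strongly correlated, so a random split would tend to overstate generalization.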

Paper Structure (16 sections, 3 figures, 5 tables)

Introduction
Proposed Method
Data and Preprocessing
Model Features: Depth, Density, and Distance
Fundamental Features
Obstruction Depth Features
Obstruction Density Features
Obstruction Distance Features
Model Architecture
Training Approach, UK and Canadian Tests
Results and Discussion
Model Performance, UK Blind Tests
Model Performance, Canadian Blind Tests
Assessing Risk of Overfitting
Frequency- and Distance-Dependent Performance
...and 1 more sections

Figures (3)

Figure 1: Path profile with a mixture of building (black), terrain (brown), and foliage (green) obstructions, with the direct path between transmitter and receiver shown in red. Since the model uses only the DSM to assess obstructions, all obstructions are treated identically, and only DSM statistics matter when computing the eight model features. All depths and distances are measured along the direct path and take into account both the angle relative to ground and Earth curvature.
Figure 2: Hexagonal-binning (2-D histogram) plot of predicted vs. measured path loss; colour indicates samples per bin (log scale). The model is trained on all UK drive tests and blind-tested on Canadian drive tests, giving a coefficient of determination $R^{2}$ of 0.88 and an RMSE of 6.74 dB.
Figure 3: Mean absolute error on Canadian test data (whiskers show the SD of the error) as a function of link distance (3 km bins, $>10000$ samples per bin, bottom x-axis) and frequency (no binning, four discrete frequencies, top x-axis).
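
The metrics quoted in the captions for Figures 2 and 3 can be computed as in the sketch below. It assumes the standard definitions (RMSE as the root of the mean squared residual, $R^2 = 1 - SS_{res}/SS_{tot}$) and 3 km distance bins as in Figure 3; the function names and the sample values are hypothetical and are not data from the paper. The SD whiskers of Figure 3 are omitted for brevity.

```python
import math
from collections import defaultdict


def rmse(measured, predicted):
    """Root-mean-square error in dB."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mae_by_distance_bin(distance_km, measured, predicted, bin_km=3.0):
    """Mean absolute error per distance bin, keyed by the bin's lower edge (km)."""
    errors = defaultdict(list)
    for d, m, p in zip(distance_km, measured, predicted):
        errors[int(d // bin_km)].append(abs(m - p))
    return {b * bin_km: sum(e) / len(e) for b, e in sorted(errors.items())}


# Made-up values for illustration only.
measured = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 121.0, 129.0]
distance_km = [1.0, 2.5, 4.0, 7.5]

print(rmse(measured, predicted))       # about 1.58
print(r_squared(measured, predicted))  # about 0.98
print(mae_by_distance_bin(distance_km, measured, predicted))
```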
