Effective Feature Selection for Predicting Spreading Factor with ML in Large LoRaWAN-based Mobile IoT Networks
Aman Prakash, Nikumani Choudhury, Anakhi Hazarika, Alekhya Gorrela
TL;DR
This work tackles the challenge of predicting LoRaWAN spreading factor (SF) using machine learning to optimize SF allocation in large mobile IoT networks. It exhaustively evaluates 31 feature combinations derived from five LoRaWAN features across four classifiers (k-NN, Decision Tree, Random Forest, Multinomial Logistic Regression) on a large public dataset, identifying RSSI and SNR as the most informative pair. The results show that the RSSI+SNR combination achieves strong predictive performance, enabling reductions in data-collection costs and training time while extending device battery life. The findings offer practical guidance for efficient SF allocation in LoRaWAN systems and point to avenues for future improvements with deeper models and broader feature sets.
Abstract
LoRaWAN is a low-power, long-range protocol that enables reliable and robust communication. This paper addresses the challenge of predicting the spreading factor (SF) in LoRaWAN networks using machine learning (ML) techniques. Optimal SF allocation is crucial for optimizing data transmission in IoT-enabled mobile devices, yet it remains challenging because environmental and network conditions fluctuate. We evaluated ML model performance on a large publicly available dataset to identify the most informative of five key LoRaWAN features: RSSI, SNR, frequency, distance between end devices and gateways, and antenna height of the end device. We further experimented with all 31 non-empty combinations of these 5 features. We trained and evaluated models using the k-nearest neighbors (k-NN), Decision Tree Classifier (DTC), Random Forest (RF), and Multinomial Logistic Regression (MLR) algorithms. The combination of RSSI and SNR was identified as the best feature set. The findings of this paper provide valuable information for reducing the overall cost of dataset collection for ML model training and for extending the battery life of LoRaWAN devices. This work contributes to a more reliable LoRaWAN system by clarifying the importance of specific feature sets for optimized SF allocation.
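The count of 31 follows from taking every non-empty subset of the 5 features (2^5 - 1 = 31). A minimal sketch of that enumeration is given below. The feature identifiers and the helper name `feature_subsets` are illustrative assumptions, not names from the paper or its dataset, and the step of training each classifier on each subset is omitted because it depends on the dataset.

```python
from itertools import combinations

# Feature names follow the abstract; the identifiers are hypothetical.
FEATURES = ["rssi", "snr", "frequency", "distance", "antenna_height"]


def feature_subsets(features):
    """Return every non-empty subset of `features`, smallest subsets first."""
    subsets = []
    for size in range(1, len(features) + 1):
        # combinations() yields each subset of the given size exactly once.
        subsets.extend(combinations(features, size))
    return subsets


subsets = feature_subsets(FEATURES)
print(len(subsets))  # 2**5 - 1 = 31
```

Each tuple in `subsets` could then be used to select dataset columns before fitting one of the four classifiers, so an exhaustive search amounts to 31 x 4 training runs.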
