Towards Real-world Deployment of NILM Systems: Challenges and Practices
Junyu Xue, Yu Zhang, Xudong Wang, Yi Wang, Guoming Tang
TL;DR
This paper addresses the practical deployment challenges of NILM by introducing a three-tier edge-cloud-client framework that distributes processing across edge and cloud to balance latency and accuracy. It uses a lightweight XGBoost NILM model at the edge and a Seq2Point-based deep learning model (with CNN and Transformer) in the cloud to achieve efficient, accurate appliance decomposition, coordinated via RabbitMQ and deployed with Gunicorn and NGINX for robust concurrency. Experimental results on real-world-like datasets show that edge processing reduces cloud workload and transmission, while cloud processing delivers higher accuracy; the combined framework achieves fast responses under high concurrency. The proposed deployment strategy, including asynchronous middleware and optimized server configurations, offers a scalable path toward practical NILM services with improved security and responsiveness.
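As a rough illustration of the Seq2Point idea named above, the sketch below builds training pairs in which a sliding window of aggregate power readings is mapped to the appliance power at the window's midpoint. The function name `seq2point_pairs`, the window length, and the toy readings are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def seq2point_pairs(
    aggregate: Sequence[float],
    appliance: Sequence[float],
    window: int,
) -> List[Tuple[List[float], float]]:
    """Build (input window, midpoint target) pairs for Seq2Point-style training.

    Hypothetical helper: each input is `window` consecutive aggregate readings,
    and the target is the appliance reading at the centre of that window.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number so it has a midpoint")
    if len(aggregate) != len(appliance):
        raise ValueError("aggregate and appliance series must have equal length")

    half = window // 2
    pairs: List[Tuple[List[float], float]] = []
    # Slide the window so that every midpoint has `half` readings on each side.
    for mid in range(half, len(aggregate) - half):
        inputs = list(aggregate[mid - half : mid + half + 1])
        pairs.append((inputs, appliance[mid]))
    return pairs


# Toy data (made up): five aggregate readings and one appliance's readings.
toy_aggregate = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_appliance = [10.0, 20.0, 30.0, 40.0, 50.0]
toy_pairs = seq2point_pairs(toy_aggregate, toy_appliance, window=3)
```

With a window of 3, the five toy readings yield three pairs, one for each midpoint that has a neighbour on both sides.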
Abstract
Non-intrusive load monitoring (NILM), as a key load monitoring technology, can greatly reduce the deployment cost associated with traditional power sensors. Previous research has largely focused on developing cloud-exclusive NILM algorithms, which often result in high computation costs and significant service delays. To address these issues, we propose a three-tier framework that enhances the real-world applicability of NILM systems through edge-cloud collaboration. Considering the computational resources available at the edge and in the cloud, we implement a lightweight NILM model at the edge and a deep-learning-based model in the cloud. In addition to these differentiated model implementations, we design a NILM-specific deployment scheme that integrates Gunicorn and NGINX to bridge the gap between theoretical algorithms and practical applications. To verify the effectiveness of the proposed framework, we apply real-world NILM scenario settings and implement the entire process of data acquisition, model training, and system deployment. The results demonstrate that our framework achieves high decomposition accuracy while significantly reducing the cloud workload and communication overhead under practical considerations.
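To make the Gunicorn and NGINX pairing more concrete, the following is a minimal sketch of a Gunicorn configuration file (conventionally `gunicorn.conf.py`). The bind address, timeout, and keep-alive values are assumptions chosen for illustration, not the settings used in the paper; the worker count follows the (2 x cores) + 1 starting point suggested in the Gunicorn documentation.

```python
# gunicorn.conf.py: illustrative values only, not the paper's configuration.
import multiprocessing

# Listen on a local port; an NGINX reverse proxy is assumed to forward
# client requests to this address.
bind = "127.0.0.1:8000"

# Starting point suggested by the Gunicorn docs: (2 x CPU cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted (assumed value).
timeout = 30

# Seconds to wait for further requests on a keep-alive connection (assumed value).
keepalive = 5
```

In a setup like this, NGINX would typically terminate client connections and proxy them to the `bind` address, while Gunicorn's worker processes handle concurrent inference requests.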
