Lightweight and Scalable Transfer Learning Framework for Load Disaggregation

L. E. Garcia-Marrero; G. Petrone; E. Monmasson

TL;DR

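RefQuery is a multi-appliance, multi-task NILM framework that keeps a pretrained disaggregation network frozen and adapts to a target home by learning only a per-appliance embedding. Experiments on three public datasets show a strong accuracy-efficiency trade-off against single-appliance and multi-appliance baselines, including modern Transformer-based methods, supporting RefQuery as a practical path toward scalable, real-time NILM on resource-constrained edge devices.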

Abstract

Non-Intrusive Load Monitoring (NILM) aims to estimate appliance-level consumption from aggregate electrical signals recorded at a single measurement point. In recent years, the field has increasingly adopted deep learning approaches; however, cross-domain generalization remains a persistent challenge due to variations in appliance characteristics, usage patterns, and background loads across homes. Transfer learning provides a practical paradigm to adapt models with limited target data, but existing methods often assume a fixed appliance set, lack flexibility for evolving real-world deployments, remain unsuitable for edge devices, or scale poorly for real-time operation. This paper proposes RefQuery, a scalable multi-appliance, multi-task NILM framework that conditions disaggregation on compact appliance fingerprints, allowing one shared model to serve many appliances without a fixed output set. RefQuery keeps a pretrained disaggregation network fully frozen and adapts to a target home by learning only a per-appliance embedding during a lightweight backpropagation stage. Experiments on three public datasets demonstrate that RefQuery delivers a strong accuracy-efficiency trade-off against single-appliance and multi-appliance baselines, including modern Transformer-based methods. These results support RefQuery as a practical path toward scalable, real-time NILM on resource-constrained edge devices.

Paper Structure (19 sections, 5 equations, 3 figures, 9 tables, 1 algorithm)


Introduction
Method
Framework overview
Stage I: Training on the Source Domain
ON Intervals Construction
Training Loss
Stage II: Target Domain Embedding Learning
Stage III: Inference in the Target Domain
Architecture
Lightweight convolutional encoder and embedding generation
Multitask prediction head
Experiments
Experimental Setup
Baselines
Sensitivity analysis
...and 4 more sections

Figures (3)

Figure 1: Three-stage RefQuery framework. Stage I: the model is trained on multiple source buildings using several appliances per building with joint power (MSE) and state (BCE) supervision; Stage II: in the target domain, the model is frozen and a target-specific reference embedding is learned per appliance using joint power (MSE) and state (BCE) supervision; Stage III: target mains is disaggregated into per-appliance state and power using the learned embeddings.
Figure 2: Compact 1D CNN feature extractor for embedding generation.
Figure 3: Multitask head architecture. Given the reference and query embeddings $\mathbf{e}_r,\mathbf{e}_q\in\mathbb{R}^{E}$, the interaction module constructs a joint representation by concatenating $[\mathbf{e}_r,\; \mathbf{e}_q,\; (\mathbf{e}_q-\mathbf{e}_r)^2,\; \mathbf{e}_q \odot \mathbf{e}_r]$, where $\odot$ denotes element-wise multiplication. The resulting $4E$-dimensional vector is processed by two shared fully connected layers with ReLU activations and width $H$ to produce a shared representation. Two task-specific branches are then applied: a classification head (dense + sigmoid) predicts the state probability $\hat{y}_{\text{state}}\in\mathbb{R}$, and a regression head (dense + ReLU) estimates a non-negative magnitude $\Delta z\in\mathbb{R}$. The final standardized power estimate is computed through a gated transformation $\hat{y}_z = \Delta z\,\hat{y}_{\text{state}} + b_{\text{off}}$, where $b_{\text{off}}=-\mu/\sigma$ anchors the OFF operating point in $z$-space.
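
The forward pass described in the Figure 3 caption can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names (`interaction`, `multitask_head`, `init_head`), the random weight initialization, the sizes `E = 8` and `H = 16`, and the statistics `mu` and `sigma` are all hypothetical. The final conversion back to watts assumes the usual standardization $z = (p - \mu)/\sigma$, under which zero power maps to $-\mu/\sigma = b_{\text{off}}$.

```python
import math
import random


def dense(x, W, b):
    """Fully connected layer: W is a list of rows (out_dim x in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]


def relu(v):
    return [max(0.0, x) for x in v]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_dense(rng, in_dim, out_dim, scale=0.1):
    """Hypothetical initializer: small Gaussian weights, zero biases."""
    W = [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    return W, [0.0] * out_dim


def init_head(E, H, seed=0):
    rng = random.Random(seed)
    return {
        "fc1": init_dense(rng, 4 * E, H),  # first shared layer, input is 4E wide
        "fc2": init_dense(rng, H, H),      # second shared layer
        "cls": init_dense(rng, H, 1),      # state (ON/OFF) branch
        "reg": init_dense(rng, H, 1),      # magnitude branch
    }


def interaction(e_r, e_q):
    """Concatenate [e_r, e_q, (e_q - e_r)^2, e_q * e_r] into a 4E vector."""
    sq_diff = [(q - r) ** 2 for r, q in zip(e_r, e_q)]
    prod = [q * r for r, q in zip(e_r, e_q)]
    return list(e_r) + list(e_q) + sq_diff + prod


def multitask_head(e_r, e_q, params, mu, sigma):
    h = interaction(e_r, e_q)
    h = relu(dense(h, *params["fc1"]))
    h = relu(dense(h, *params["fc2"]))
    state = sigmoid(dense(h, *params["cls"])[0])     # state probability
    delta_z = max(0.0, dense(h, *params["reg"])[0])  # non-negative magnitude
    b_off = -mu / sigma                              # OFF anchor in z-space
    y_z = delta_z * state + b_off                    # gated standardized power
    return state, delta_z, y_z


# Usage with made-up embeddings and made-up appliance statistics.
E, H = 8, 16
params = init_head(E, H)
rng = random.Random(1)
e_r = [rng.gauss(0.0, 1.0) for _ in range(E)]  # reference embedding
e_q = [rng.gauss(0.0, 1.0) for _ in range(E)]  # query embedding
mu, sigma = 50.0, 120.0                        # hypothetical mean/std in watts
state, delta_z, y_z = multitask_head(e_r, e_q, params, mu, sigma)
watts = y_z * sigma + mu                       # assumed de-standardization
print(state, delta_z, y_z, watts)
```

Under the assumed standardization, the de-standardized estimate reduces to $\sigma\,\Delta z\,\hat{y}_{\text{state}}$, so it is non-negative by construction and approaches zero watts as the state probability approaches zero, which is one way to read the caption's statement that $b_{\text{off}}$ anchors the OFF operating point.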
