Clustering-based Multitasking Deep Neural Network for Solar Photovoltaics Power Generation Prediction

Hui Song, Zheng Miao, Ali Babalhavaeji, Saman Mehrnia, Mahdi Jalili, Xinghuo Yu

TL;DR

This work tackles PV power forecasting across heterogeneous customer groups by introducing CM-DNN, a clustering-based multitask framework. It partitions the data into customer-type clusters with K-means, trains a cluster-specific predictor for each, and then uses PSO-optimized inter-model knowledge transfer to improve each target task $i$ through a transferred parameter set $P_i' = \sum_{k\in S_i} \alpha_{i,k} P_k^* + \alpha_{i,i} P_i^*$, where $S_i$ is the selected set of source tasks, $P_k^*$ are the trained parameters of task $k$, and the coefficients $\alpha_{i,k}$ control how much of each model is reused. Empirical results on a real CitiPower/Powercor dataset show that CM-DNN outperforms single-model baselines (RNN, CNN-LSTM, LSTM, GRU) across residential, agricultural, industrial, and commercial tasks, with the LSTM- and GRU-based CM models achieving the strongest gains. The approach offers a practical way to exploit heterogeneity in PV data while keeping customer data de-identified, potentially improving dispatch and energy management in smart grids.
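The transferred parameter set above is a weighted sum of trained parameters, which can be illustrated with a minimal NumPy sketch. It assumes that every cluster model shares the same architecture, so that parameter arrays with the same name have the same shape; the function name `transfer_parameters`, the dictionary layout, and the coefficient values are hypothetical and chosen only for illustration. In the paper the coefficients and the source subset are found by PSO, which is not reproduced here.

```python
import numpy as np


def transfer_parameters(params, target, sources, alpha):
    """Compute P_i' = sum_{k in S_i} alpha[k] * P_k^* + alpha[target] * P_i^*.

    params:  dict mapping task name -> dict of parameter name -> np.ndarray
    target:  name of the target task i
    sources: iterable of selected source task names (S_i), excluding the target
    alpha:   dict mapping task name -> transfer coefficient (illustrative values)
    """
    combined = {}
    for name, own_value in params[target].items():
        # Start from the target's own trained parameters, scaled by alpha_{i,i}.
        mixed = alpha[target] * own_value
        # Add the scaled contribution of each selected source task.
        for k in sources:
            mixed = mixed + alpha[k] * params[k][name]
        combined[name] = mixed
    return combined


# Toy trained parameters for three tasks (values are made up).
params = {
    "residential": {"W": np.full((2, 2), 1.0), "b": np.zeros(2)},
    "industrial": {"W": np.full((2, 2), 3.0), "b": np.ones(2)},
    "commercial": {"W": np.full((2, 2), 5.0), "b": np.full(2, 2.0)},
}

# Only "industrial" is selected as a source for the "residential" target.
alpha = {"residential": 0.5, "industrial": 0.5}
new_params = transfer_parameters(params, "residential", ["industrial"], alpha)
```

In this toy case each entry of `new_params["W"]` is 0.5 * 1.0 + 0.5 * 3.0, and the unselected "commercial" model contributes nothing, which mirrors the selective reuse described above.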

Abstract

The growing installation of photovoltaic (PV) cells increases generation from renewable energy sources (RES) but also increases the uncertainty of energy scheduling. Predicting PV power generation is therefore important for energy management and dispatch optimization in the smart grid. However, PV power generation data is often collected across different types of customers (e.g., residential, agricultural, industrial, and commercial), while the customer information is always de-identified. As a result, a single forecasting model is often trained on all of the PV power generation data, leaving the predictor to learn the various patterns through intra-model self-learning, instead of a separate predictor being constructed for each customer type. In this paper, we propose a clustering-based multitasking deep neural network (CM-DNN) framework for PV power generation prediction. K-means is applied to cluster the data into different customer types. For each type, a deep neural network (DNN) is employed and trained until its accuracy can no longer be improved. Subsequently, for a specified customer type (i.e., the target task), inter-model knowledge transfer is conducted to enhance its training accuracy. During this process, source task selection chooses the optimal subset of tasks (excluding the target customer type), and each selected source task uses a coefficient to determine how much of its DNN model knowledge (weights and biases) is transferred to the target prediction task. The proposed CM-DNN is tested on a real-world PV power generation dataset, and its superiority is demonstrated by comparing its prediction performance with that of a single model trained on the dataset without clustering.
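The clustering step can be sketched as follows. This is a plain NumPy implementation of standard K-means run on synthetic daily generation profiles, not the authors' pipeline: the feature representation (24 hourly values per customer), the synthetic data, and the names `kmeans` and `profiles` are all assumptions made for illustration. The cluster indices it returns are arbitrary, so mapping them to labels such as residential or industrial would need a separate interpretation step.

```python
import numpy as np


def kmeans(X, n_clusters, n_iters=100, seed=0):
    """Standard K-means (Lloyd's algorithm) on the rows of X."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with distinct, randomly chosen rows of X.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Assign every row to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep empty clusters fixed.
        new_centers = centers.copy()
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members) > 0:
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so that labels match the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


# Synthetic daily profiles: a daylight-shaped curve at four different scales.
rng = np.random.default_rng(42)
hours = np.arange(24)
shape = np.clip(np.sin((hours - 6) / 12 * np.pi), 0.0, None)
scales = [1.0, 5.0, 20.0, 50.0]  # made-up capacities, one per synthetic group
profiles = np.vstack(
    [s * shape + rng.normal(0.0, 0.05 * s, size=(10, 24)) for s in scales]
)

labels, centers = kmeans(profiles, n_clusters=4)

# One dataset per cluster; each would then train its own predictor.
clusters = {c: profiles[labels == c] for c in range(4)}
```

Each entry of `clusters` plays the role of one task dataset, on which a separate DNN would be trained before any inter-model knowledge transfer takes place.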

Paper Structure

This paper contains 14 sections, 8 equations, 6 figures, 2 tables, and 1 algorithm.

Figures (6)

  • Figure 1: The overall CM-DNN framework: (a) the diagram of CM-DNN working process; (b) the inter-model knowledge transfer process for a specified type of customers $i, \forall i \in \{1, 2, \dots, n\}$.
  • Figure 2: The structure of a cell in LSTM.
  • Figure 3: Clustering PV power generation into residential, agricultural, industrial, and commercial datasets.
  • Figure 4: Boxplots of RNN, CNNLSTM, LSTM, GRU, CM-RNN, CM-CNNLSTM, CM-LSTM, and CM-GRU over training and testing datasets across residential, agricultural, industrial, and commercial datasets.
  • Figure 5: Average training RMSE of RNN, CNNLSTM, LSTM, GRU, CM-RNN, CM-CNNLSTM, CM-LSTM, and CM-GRU across residential, agricultural, industrial, and commercial datasets.
  • ...and 1 more figures