Predicting Battery Capacity Fade Using Probabilistic Machine Learning Models With and Without Pre-Trained Priors

Michael J. Kenney, Katerina G. Malollari, Sergei V. Kalinin, Maxim Ziatdinov

TL;DR

Pre-training can be used to learn prior distributions over the hyperparameters of Gaussian process (GP) and structured Gaussian process (sGP) models. The pre-trained sGP achieves accuracy similar to a Bayesian neural network (BNN) while giving improved uncertainty estimates.

Abstract

Lithium-ion batteries are a key energy storage technology driving revolutions in mobile electronics, electric vehicles and renewable energy storage. Capacity retention is a vital performance measure that is frequently utilized to assess whether these batteries have approached their end-of-life. Machine learning (ML) offers a powerful tool for predicting capacity degradation based on past data, and, potentially, prior physical knowledge, but the degree to which an ML prediction can be trusted is of significant practical importance in situations where consequential decisions must be made based on battery state of health. This study explores the efficacy of fully Bayesian machine learning in forecasting battery health with the quantification of uncertainty in its predictions. Specifically, we implemented three probabilistic ML approaches and evaluated the accuracy of their predictions and uncertainty estimates: a standard Gaussian process (GP), a structured Gaussian process (sGP), and a fully Bayesian neural network (BNN). In typical applications of GP and sGP, their hyperparameters are learned from a single sample while, in contrast, BNNs are typically pre-trained on an existing dataset to learn the weight distributions before being used for inference. This difference in methodology gives the BNN an advantage in learning global trends in a dataset and makes BNNs a good choice when training data is available. However, we show that pre-training can also be leveraged for GP and sGP approaches to learn the prior distributions of the hyperparameters and that in the case of the pre-trained sGP, similar accuracy and improved uncertainty estimation compared to the BNN can be achieved. This approach offers a framework for a broad range of probabilistic machine learning scenarios where past data is available and can be used to learn priors for (hyper)parameters of probabilistic ML models.
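
The idea of learning hyperparameter priors from past data can be illustrated with a small sketch. It is not the authors' implementation: the abstract describes fully Bayesian inference, whereas the code below is a simplified empirical-Bayes stand-in that uses point estimates. The toy quadratic fade curves, the fixed noise variance, the grid search, and the names `fit_lengthscale` and `lognormal_log_prior` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale, variance=1.0):
    # Squared-exponential kernel on 1-D inputs.
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def log_marginal_likelihood(x, y, lengthscale, noise_var=1e-2):
    # Zero-mean GP log evidence, computed via a Cholesky factorization.
    K = rbf_kernel(x, x, lengthscale) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(x) * np.log(2 * np.pi))

def fit_lengthscale(x, y, grid, log_prior=None):
    # Grid search: maximum likelihood, or MAP when a log-prior is supplied.
    scores = np.array([log_marginal_likelihood(x, y, l) for l in grid])
    if log_prior is not None:
        scores = scores + log_prior(grid)
    return grid[np.argmax(scores)]

def lognormal_log_prior(mu, sigma):
    # Log-density of a log-normal distribution over the lengthscale.
    def log_prior(l):
        return (-np.log(l * sigma * np.sqrt(2 * np.pi))
                - 0.5 * ((np.log(l) - mu) / sigma) ** 2)
    return log_prior

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)        # normalized cycle number (assumed)
grid = np.logspace(-1.5, 0.5, 30)    # candidate lengthscales (assumed)

# "Pre-training": fit one lengthscale per historic curve (toy data).
past_fits = []
for _ in range(20):
    a = rng.uniform(0.1, 0.3)
    y = 1.0 - a * x**2 + 0.005 * rng.standard_normal(x.size)
    past_fits.append(fit_lengthscale(x, y - y.mean(), grid))
past_fits = np.array(past_fits)

# Summarize the fitted values as a log-normal prior (floor on sigma for stability).
mu = np.log(past_fits).mean()
sigma = np.log(past_fits).std() + 1e-3

# New curve: only the first 50% of cycles are observed (the "context").
n_ctx = x.size // 2
y_new = 1.0 - 0.2 * x**2 + 0.005 * rng.standard_normal(x.size)
x_ctx = x[:n_ctx]
y_ctx = y_new[:n_ctx] - y_new[:n_ctx].mean()

l_ml = fit_lengthscale(x_ctx, y_ctx, grid)                        # no pre-training
l_map = fit_lengthscale(x_ctx, y_ctx, grid,
                        lognormal_log_prior(mu, sigma))           # pre-trained prior
print("lengthscale without prior:", l_ml, "with learned prior:", l_map)
```

In this sketch the learned prior pulls the single-curve estimate toward values that worked on past curves, which is the intuition behind pre-trained priors. A fully Bayesian treatment would instead sample the hyperparameter posterior under that prior.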

Paper Structure

This paper contains 13 sections, 12 equations, 5 figures, 5 tables, and 1 algorithm.

Figures (5)

  • Figure 1: (A) 10 randomly selected curves (out of 250) from the simulated dataset and (B) 10 randomly selected curves (out of 125) from the experimental dataset.
  • Figure 2: Schematic illustration of three different approaches for forecasting battery performance. (A) The model is trained on the part of the curve for which data is available and is used to forecast battery state of health (SOH) on the remaining part. (B) Model weights are pre-trained offline using available data to forecast the battery state of health from partially measured data. The model is then deployed on new curves as is. (C) Model priors are pre-trained offline using available data and used to set up new priors for a single-shot model from (A).
  • Figure 3: Average mean absolute percentage error (MAPE) (A) and average negative log predictive density (NLPD) (B) for all six models on the simulated test dataset for context lengths of 50%-80%. SOH vs. normalized cycles with posterior mean and 95% confidence intervals on simulated testing data for models without pre-training (C) and with pre-training (D).
  • Figure 4: Average MAPE (A) and average NLPD (B) for all six models on the experimental test dataset for context lengths of 50%-80%. SOH vs. normalized cycles with posterior mean and 95% confidence intervals on experimental testing data for models without pre-training (C) and with pre-training (D).
  • Figure 5: Structured GP posterior mean and 95% confidence intervals for context lengths of 50%-80% on the experimental dataset without pre-training (A) and with pre-training (B), and on the simulated dataset without pre-training (C) and with pre-training (D).
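
Figures 3 and 4 report MAPE and NLPD. As a rough guide to what these metrics measure, the sketch below computes both under the assumption of a Gaussian predictive distribution at each cycle. The paper's exact definitions and averaging may differ, and the function names and example numbers are hypothetical.

```python
import numpy as np

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent (accuracy of the mean).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def gaussian_nlpd(y_true, mean, std):
    # Negative log predictive density, averaged over points, assuming an
    # independent Gaussian prediction N(mean, std^2) at each point.
    # Lower is better; it penalizes both over- and under-confidence.
    y_true = np.asarray(y_true, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(std, dtype=float) ** 2
    return np.mean(0.5 * np.log(2 * np.pi * var)
                   + 0.5 * (y_true - mean) ** 2 / var)

def interval_95(mean, std):
    # Central 95% interval of a Gaussian prediction (mean +/- 1.96 std).
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    return mean - 1.96 * std, mean + 1.96 * std

# Hypothetical SOH values and predictions, for illustration only.
soh_true = np.array([1.00, 0.95, 0.90])
soh_mean = np.array([1.00, 0.96, 0.88])
soh_std = np.array([0.02, 0.02, 0.02])

print("MAPE (%):", mape(soh_true, soh_mean))
print("NLPD:", gaussian_nlpd(soh_true, soh_mean, soh_std))
print("95% interval:", interval_95(soh_mean, soh_std))
```

MAPE depends only on the predicted mean, while NLPD also depends on the predicted spread, so two models with similar MAPE can differ substantially in NLPD.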