Learning to fuse: dynamic integration of multi-source data for accurate battery lifespan prediction

He Shanxuan, Lin Zuhong, Yu Bolun, Gao Xu, Long Biao, Yao Jingjing

TL;DR

The paper addresses lithium-ion battery lifespan prediction by combining dynamic multi-source data fusion with a stacked ensemble of Ridge regression, LSTM, and XGBoost, using an entropy-based weighting scheme to mitigate cross-dataset heterogeneity. It reports a mean absolute error of 0.0058, a root mean square error of 0.0092, and an R^2 of 0.9839, outperforming baselines with a 46.2% improvement in R^2 and an 83.2% reduction in RMSE. SHAP explanations identify Qdlin and Temp_m as key aging indicators. The framework is designed to be scalable and interpretable, supporting battery health management, optimized maintenance, and safety across diverse chemistries, datasets, and energy storage systems.

Abstract

Accurate prediction of lithium-ion battery lifespan is vital for ensuring operational reliability and reducing maintenance costs in applications like electric vehicles and smart grids. This study presents a hybrid learning framework for precise battery lifespan prediction, integrating dynamic multi-source data fusion with a stacked ensemble (SE) modeling approach. The framework draws on heterogeneous datasets from the National Aeronautics and Space Administration (NASA), Center for Advanced Life Cycle Engineering (CALCE), MIT-Stanford-Toyota Research Institute (TRC), and nickel cobalt aluminum (NCA) chemistries, and uses an entropy-based dynamic weighting mechanism to mitigate variability across them. The SE model combines Ridge regression, long short-term memory (LSTM) networks, and eXtreme Gradient Boosting (XGBoost), capturing temporal dependencies and nonlinear degradation patterns. It achieves a mean absolute error (MAE) of 0.0058, root mean square error (RMSE) of 0.0092, and coefficient of determination (R^2) of 0.9839, outperforming established baseline models with a 46.2% improvement in R^2 and an 83.2% reduction in RMSE. Shapley additive explanations (SHAP) analysis identifies differential discharge capacity (Qdlin) and temperature of measurement (Temp_m) as critical aging indicators. This scalable, interpretable framework enhances battery health management, supporting optimized maintenance and safety across diverse energy storage systems.
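For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MAE, RMSE, and R^2 are conventionally computed. It uses the standard textbook definitions, not code from the paper. The function names and the toy capacity values are hypothetical and are not the paper's data.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual variance / total variance."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical normalized capacity values (illustration only).
y_true = [1.00, 0.95, 0.90, 0.85]
y_pred = [0.99, 0.96, 0.90, 0.83]

print(f"MAE={mae(y_true, y_pred):.4f}  "
      f"RMSE={rmse(y_true, y_pred):.4f}  "
      f"R^2={r2(y_true, y_pred):.4f}")
```

Under these definitions, RMSE is never smaller than MAE on the same predictions, which is consistent with the reported pair of 0.0058 and 0.0092.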

Paper Structure

This paper contains 3 sections.