Enhancing Predictive Accuracy in Pharmaceutical Sales Through An Ensemble Kernel Gaussian Process Regression Approach
Shahin Mirshekari, Mohammadreza Moradi, Hossein Jafari, Mehdi Jafari, Mohammad Ensaf
TL;DR
The paper addresses forecasting of pharmaceutical sales time series under pattern heterogeneity and uncertainty. It proposes Gaussian Process Regression (GPR) with an ensemble kernel that linearly combines Exponential Squared, Revised Matérn, and Rational Quadratic kernels, with the kernel weights learned through Bayesian optimization. The ensemble kernel achieves an $R^2$ score near 1.0 and lower MSE, MAE, and RMSE than the individual kernels, which shows the value of combining kernels for complex real-world data. Because GPR yields predictive uncertainty alongside point forecasts, the approach is of practical use for healthcare analytics on large, heterogeneous sales datasets. The dataset comprises hundreds of thousands of transactions across multiple drug categories, underscoring the method's relevance to time-series forecasting in the pharmaceutical domain.
Abstract
This research employs Gaussian Process Regression (GPR) with an ensemble kernel that integrates Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify the optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior predictive accuracy, achieving an \( R^2 \) score near 1.0 and significantly lower Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE) than the individual kernels. These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
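To make the weighted combination concrete, the following is a minimal sketch of GPR with a linearly combined kernel, not the authors' implementation. It uses the weights reported above as given (they sum to 1.10, so no normalization is assumed). Several details are assumptions made only for illustration: a standard Matérn kernel with smoothness 3/2 stands in for the Revised Matérn kernel, whose exact form is not given here; all length scales and the Rational Quadratic shape parameter are fixed at 1.0; and the function names are hypothetical.

```python
import numpy as np

# Kernel weights as reported in the abstract (used as given; they sum to 1.10).
WEIGHTS = {"exp_squared": 0.76, "matern": 0.21, "rational_quadratic": 0.13}


def sq_dists(a, b):
    """Pairwise squared Euclidean distances between the rows of a and b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def exp_squared(a, b, length=1.0):
    """Exponential Squared (RBF) kernel. The length scale is an assumption."""
    return np.exp(-0.5 * sq_dists(a, b) / length ** 2)


def matern32(a, b, length=1.0):
    """Standard Matern kernel with nu = 3/2.

    Assumption: this stands in for the paper's "Revised Matern" kernel.
    """
    r = np.sqrt(sq_dists(a, b)) / length
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def rational_quadratic(a, b, length=1.0, alpha=1.0):
    """Rational Quadratic kernel. length and alpha are assumptions."""
    return (1.0 + sq_dists(a, b) / (2.0 * alpha * length ** 2)) ** (-alpha)


def ensemble_kernel(a, b, weights=WEIGHTS):
    """Weighted linear combination of the three base kernels."""
    return (
        weights["exp_squared"] * exp_squared(a, b)
        + weights["matern"] * matern32(a, b)
        + weights["rational_quadratic"] * rational_quadratic(a, b)
    )


def gpr_predict(x_train, y_train, x_test, noise=1e-6):
    """Zero-mean GP posterior mean and variance under the ensemble kernel."""
    k_train = ensemble_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_cross = ensemble_kernel(x_test, x_train)
    k_test_diag = np.diag(ensemble_kernel(x_test, x_test))

    # Solve through a Cholesky factor instead of forming an explicit inverse.
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha

    v = np.linalg.solve(chol, k_cross.T)
    var = k_test_diag - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off
```

Inputs are two-dimensional arrays with one row per observation, so a univariate time index is passed as a column vector, for example `np.linspace(0, 10, 20)[:, None]`. In the paper the weights come from Bayesian optimization; here they are simply fixed constants, and a full treatment would also fit the length scales and noise level.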
