CrossGP: Cross-Day Glucose Prediction Excluding Physiological Information

Ziyi Zhou; Ming Cheng; Yanjun Cui; Xingjian Diao; Zhaorui Ma

TL;DR

This work addresses privacy concerns in glucose prediction by proposing CrossGP, a cross-day glucose predictor that relies solely on the patient's external activities rather than private demographic or physiological data. CrossGP uses a dual-branch, multi-scale feature extractor with attention fusion and is trained with a cross-entropy objective to classify next-day glycemic control into three categories based on time in range (TIR): Good if $\mathrm{TIR}>0.7$, Moderate if $0.55\le \mathrm{TIR}\le 0.7$, and Poor if $\mathrm{TIR}<0.55$. The method includes daily record merging and Gaussian-noise data augmentation. On Anderson's diabetes dataset, CrossGP outperforms Logistic Regression, Random Forest, and XGBoost in accuracy, F1, and recall. These results suggest that daily glycemic control can be predicted without sensitive data, which supports future deployment in health monitoring and smart healthcare applications.
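The three-category labeling rule can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, and the 70 to 180 mg/dL target range used to compute TIR is an assumption, since this summary does not state the range. Only the thresholds 0.7 and 0.55 come from the text above.

```python
from typing import List, Sequence

# Assumed glucose target range in mg/dL; this summary does not state it.
LOW_MG_DL = 70.0
HIGH_MG_DL = 180.0


def time_in_range(readings: Sequence[float],
                  low: float = LOW_MG_DL,
                  high: float = HIGH_MG_DL) -> float:
    """Fraction of one day's CGM readings that fall inside [low, high]."""
    if not readings:
        raise ValueError("need at least one CGM reading")
    in_range = sum(1 for r in readings if low <= r <= high)
    return in_range / len(readings)


def glycemic_label(tir: float) -> str:
    """Map a TIR fraction to the three categories given in the TL;DR."""
    if tir > 0.7:
        return "Good"
    if tir >= 0.55:
        return "Moderate"
    return "Poor"


def next_day_labels(daily_readings: Sequence[Sequence[float]]) -> List[str]:
    """Cross-day targets: entry d holds the category of day d + 1.

    The last day has no following day, so the result is one item shorter
    than the input.
    """
    labels = [glycemic_label(time_in_range(day)) for day in daily_readings]
    return labels[1:]
```

In this sketch, the features of day d would be paired with `next_day_labels(...)[d]`, which is one plausible reading of "cross-day" prediction.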

Abstract

The growing number of diabetic patients is a serious societal issue, with significant negative impacts on public health and national healthcare expenditure. Because diabetes can lead to serious complications, early glucose prediction is necessary for timely medical treatment. Existing glucose prediction methods typically use patients' private data (e.g., age, gender, ethnicity) and physiological parameters (e.g., blood pressure, heart rate) as features, which inevitably raises privacy concerns. Moreover, these models generally focus on either long-term (monthly) or short-term (minute-level) predictions. Long-term methods are generally inaccurate because external uncertainties can greatly affect glucose values, while short-term methods fail to provide timely medical guidance. To address these issues, we propose CrossGP, a novel machine-learning framework for cross-day glucose prediction based solely on the patient's external activities, without any physiological parameters. We also implement three baseline models for comparison. Extensive experiments on Anderson's dataset show that CrossGP outperforms these baselines and indicate its potential for real-life applications.

Paper Structure (18 sections, 5 equations, 5 figures, 3 tables)

Introduction
Methods
Logistic Regression
Random Forest
XGBoost
CrossGP (Ours)
Data Processing
Model Architecture
Experiments
Dataset Description
Dataset Processing and Feature Manipulation
Records Merging
TIR, TBR, and TAR
Label Generation
Evaluation Metrics
...and 3 more sections

Figures (5)

Figure 1: Overview of the CrossGP architecture. (1) We pre-process the original data (e.g., records merging) and apply data augmentation to improve the model's robustness. (2) The processed data is fed into the model to predict the corresponding class. The deep and shallow branches capture multi-scale features, and an attention layer then fuses them. (3) Following the standard criterion, the cross-entropy loss is computed between the predicted categories and the ground-truth labels.
Figure 2: Visualization of the CGM data. The percentages of TIR, TBR, and TAR for 30 subjects are shown; TIR and TAR predominate.
Figure 3: Visualization of the feature vectors. The distributions of four features (meal, meal bolus, correction bolus, total bolus) for 30 subjects are shown.
Figure 4: Visualization of glycemic control. The glycemic control of an example subject is illustrated, covering good, moderate, and poor days.
Figure 5: Visualization of feature importance results. The top-3 most important features learned by each model are shown. The results are consistent across models, which supports our assumption that external activities can be used for glucose prediction.
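
The pre-processing step in the Figure 1 caption (records merging followed by Gaussian-noise augmentation) could look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's procedure: the record layout, the function names, summation as the merge rule, and the values of `sigma` and `copies` are all hypothetical. Only the four feature names come from the Figure 3 caption.

```python
import random
from collections import defaultdict

# Feature names taken from the Figure 3 caption; the key spelling is assumed.
FEATURES = ("meal", "meal_bolus", "correction_bolus", "total_bolus")


def merge_daily(records):
    """Sum per-event records into one feature vector per calendar day.

    Each record is a dict with a "date" key plus the FEATURES keys.
    Summation is an assumed merge rule, not one confirmed by the paper.
    """
    totals = defaultdict(lambda: [0.0] * len(FEATURES))
    for rec in records:
        for i, name in enumerate(FEATURES):
            totals[rec["date"]][i] += float(rec.get(name, 0.0))
    # Return days in sorted order so consecutive days are adjacent.
    return {day: totals[day] for day in sorted(totals)}


def augment_gaussian(vector, sigma=0.05, copies=3, rng=None):
    """Return noisy copies of a feature vector: x' = x + N(0, sigma^2).

    sigma and copies are placeholder values chosen for illustration.
    """
    rng = rng or random.Random()
    return [[x + rng.gauss(0.0, sigma) for x in vector] for _ in range(copies)]
```

Each augmented copy would keep the label of the vector it was derived from, so the augmentation enlarges the training set without changing the class balance of the original days.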
