Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion

Md. Tahsin Amin; Tanim Ahmmod; Zannatul Ferdus; Talukder Naemul Hasan Naem; Ehsanul Ferdous; Arpita Bhattacharjee; Ishmam Ahmed Solaiman; Nahiyan Bin Noor

TL;DR

This research predicts cardiovascular disease using a merged dataset of 1,190 patient records. It compares traditional machine learning models with open-source large language models accessed via OpenRouter APIs, and with a hybrid fusion of the ML ensemble and LLM reasoning under Gemini 2.5 Flash.

Abstract

Cardiovascular disease is the leading cause of death globally, necessitating early identification, precise risk classification, and dependable decision-support technologies. Machine learning (ML) algorithms, especially ensemble approaches such as Random Forest, XGBoost, LightGBM, and CatBoost, excel at modeling complex, non-linear patient data and routinely outperform logistic regression, while the advent of large language models (LLMs) provides new zero-shot and few-shot reasoning capabilities. This research predicts cardiovascular disease using a merged dataset of 1,190 patient records, comparing traditional ML models with open-source LLMs accessed via OpenRouter APIs. Among the standalone approaches, ML ensembles achieved the highest performance (95.78% accuracy, ROC-AUC 0.96), while LLMs performed moderately in zero-shot (78.9% accuracy) and few-shot (72.6% accuracy) settings. A hybrid fusion of the ML ensemble and LLM reasoning under Gemini 2.5 Flash achieved the best overall results (96.62% accuracy, 0.97 AUC), showing that LLMs work best when combined with ML models rather than used alone. The proposed hybrid method improved robustness in uncertain cases. These results indicate that ensemble ML remains the strongest approach for structured tabular prediction, and that integrating it into hybrid ML-LLM systems yields a minor gain and opens the way to more reliable clinical decision-support tools.
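To make the idea of voting fusion concrete, the following is a minimal sketch of how an ML ensemble and a set of LLM predictions could be combined. It is not the authors' pipeline: the function names (`soft_vote`, `hard_vote`, `fuse`), the `threshold` and `margin` parameters, the tie-breaking rule, and the rule of deferring to the LLM vote only when the ML ensemble is uncertain are all assumptions chosen for illustration.

```python
from statistics import mean


def soft_vote(probabilities):
    """Soft voting: average the positive-class probabilities of the ML models."""
    return mean(probabilities)


def hard_vote(labels):
    """Hard voting: majority over 0/1 labels. Ties resolve to 1 (an assumption)."""
    return 1 if sum(labels) * 2 >= len(labels) else 0


def fuse(ml_probs, llm_labels, threshold=0.5, margin=0.1):
    """Hypothetical fusion rule (not taken from the paper).

    Trust the ML soft vote when it is at least `margin` away from
    `threshold`; otherwise let the LLM majority vote decide.
    """
    p = soft_vote(ml_probs)
    if abs(p - threshold) >= margin:
        return 1 if p >= threshold else 0
    return hard_vote(llm_labels)


# Confident ML ensemble: the LLM votes are ignored.
confident = fuse([0.9, 0.8, 0.85], [0, 0, 0])

# Uncertain ML ensemble (average 0.5): the LLM majority decides.
uncertain = fuse([0.55, 0.5, 0.45], [1, 1, 0])
```

In this sketch the LLM votes only matter for borderline cases, which is one plausible reading of the abstract's statement that the hybrid method helps in uncertain situations. Other fusion rules, such as weighted voting across all models, would be equally valid illustrations.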

Paper Structure (19 sections, 8 figures, 6 tables)

Introduction
Literature Review
Methodology
Dataset Description and Preprocessing
Machine Learning Models and Training Procedure
Ensemble Machine Learning Voting Strategy
Zero-shot and few-shot LLMs
LLM Ensemble Voting
Proposed Hybrid ML–LLM Fusion Framework
Result
Performance of Individual Machine Learning Models
Ensemble Machine Learning Results
Large Language Model Predictions
LLM Voting Result
ML–LLM Fusion Pipeline Results and Comparison
...and 4 more sections

Figures (8)

Figure 1: ML Voting Model
Figure 2: Proposed Hybrid ML–LLM Fusion Framework
Figure 3: Model test accuracy comparison for all machine learning models
Figure 4: ROC curves comparison for all machine learning models
Figure 5: Confusion matrices of ensemble predictions: soft voting (left) and hard voting (right)
...and 3 more figures
