Proof-of-TBI -- Fine-Tuned Vision Language Model Consortium and OpenAI-o3 Reasoning LLM-Based Medical Diagnosis Support System for Mild Traumatic Brain Injury (TBI) Prediction

Ross Gore, Eranga Bandara, Sachin Shetty, Alberto E. Musto, Pratip Rana, Ambrosio Valencia-Romero, Christopher Rhea, Lobat Tayebi, Heather Richter, Atmaram Yarlagadda, Donna Edmonds, Steven Wallace, Donna Broshek

TL;DR

This work tackles the challenge of detecting mild TBI from MRI, where symptoms can be subtle and difficult to interpret. It introduces Proof-of-TBI, a platform that fuses a consortium of fine-tuned vision-language models with the OpenAI-o3 reasoning LLM, orchestrated by LLM agents for end-to-end automation. The system leverages a Data Lake for data management, 4-bit quantized QLoRA fine-tuning on consumer hardware, and a consensus-based VLM ensemble whose outputs are reasoned over by OpenAI-o3 to yield a final diagnosis. Developed in collaboration with the U.S. Army Medical Research team, the approach aims to deliver robust, secure, and transparent TBI predictions and paves the way for applying similar paradigms to other medical-imaging tasks.

Abstract

Mild Traumatic Brain Injury (TBI) detection presents significant challenges due to the subtle and often ambiguous presentation of symptoms in medical imaging, making accurate diagnosis a complex task. To address these challenges, we propose Proof-of-TBI, a medical diagnosis support system that integrates multiple fine-tuned vision-language models with the OpenAI-o3 reasoning large language model (LLM). Our approach fine-tunes multiple vision-language models on a labeled dataset of TBI MRI scans, training them to diagnose TBI symptoms effectively. The predictions from these models are aggregated through a consensus-based decision-making process. The system then evaluates the predictions from all fine-tuned vision-language models using the OpenAI-o3 reasoning LLM, a model that has demonstrated remarkable reasoning performance, to produce the final diagnosis. LLM agents orchestrate the interactions between the vision-language models and the reasoning LLM, managing the final decision-making process with transparency, reliability, and automation. This end-to-end decision-making workflow combines the vision-language model consortium with the OpenAI-o3 reasoning LLM, enabled by custom prompts engineered by the LLM agents. The prototype for the proposed platform was developed in collaboration with the U.S. Army Medical Research team in Newport News, Virginia, and incorporates five fine-tuned vision-language models. The results demonstrate the transformative potential of combining fine-tuned vision-language model inputs with the OpenAI-o3 reasoning LLM to create a robust, secure, and highly accurate diagnostic system for mild TBI prediction. To the best of our knowledge, this research represents the first application of fine-tuned vision-language models integrated with a reasoning LLM for TBI prediction tasks.
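
The consensus step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the aggregation rule, so a simple majority vote is assumed here. The `Prediction` type, the model names, the labels, and the prompt wording are all hypothetical, and no call to any model API is made: the sketch only tallies the votes and assembles the text that an agent might hand to the reasoning LLM.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    model: str      # hypothetical identifier of a fine-tuned vision-language model
    label: str      # hypothetical label, e.g. "TBI" or "No TBI"
    rationale: str  # free-text explanation returned by that model


def consensus(predictions):
    """Majority vote (an assumption): return the top label and its vote share."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(p.label for p in predictions)
    # most_common orders equal counts by first occurrence, so ties go to
    # the label that appears first in the input.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


def build_reasoning_prompt(predictions):
    """Assemble an illustrative prompt summarizing every model's output."""
    label, share = consensus(predictions)
    lines = ["Predictions from the vision-language model consortium:"]
    for p in predictions:
        lines.append(f"- {p.model}: {p.label} ({p.rationale})")
    lines.append(f"Majority label: {label} ({share:.0%} of models).")
    lines.append("Review the predictions above and state a final diagnosis.")
    return "\n".join(lines)


# Five placeholder predictions, matching the five-model prototype.
example = [
    Prediction("model-a", "TBI", "placeholder rationale"),
    Prediction("model-b", "TBI", "placeholder rationale"),
    Prediction("model-c", "No TBI", "placeholder rationale"),
    Prediction("model-d", "TBI", "placeholder rationale"),
    Prediction("model-e", "No TBI", "placeholder rationale"),
]
prompt = build_reasoning_prompt(example)
```

The sketch passes every individual prediction to the reasoning step rather than only the majority label. This is one way to read the statement that the reasoning LLM evaluates the predictions from all fine-tuned models, but it is not confirmed as the authors' design.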

Paper Structure

This paper contains 16 sections, 13 figures, 1 table.

Figures (13)

  • Figure 1: Proof-of-TBI platform layered architecture.
  • Figure 2: Fine-tune Vision LLMs with QLoRA and deploy with Ollama.
  • Figure 3: Vision LLM integration flow with Ollama LLM-API, LlamaIndex, LangChain and Smart Contracts.
  • Figure 4: The required data format of the unsloth library to fine-tune the vision language model.
  • Figure 5: Prompt for OpenAI-o3 reasoning LLM for final diagnosis reasoning.
  • ...and 8 more figures