Opportunities and challenges in the application of large artificial intelligence models in radiology

Liangrui Pan, Zhenyu Zhao, Ying Lu, Kewei Tang, Liyong Fu, Qingchun Liang, Shaoliang Peng

TL;DR

This paper surveys the opportunities and challenges of applying large AI models to radiology, tracing the history, principles, and rising prominence of multimodal and video-based large models. It synthesizes radiology-specific applications across education, report generation, and unimodal/multimodal imaging tasks, highlighting representative models and architectures such as transformer-based LLMs and diffusion-based video generators. Key challenges identified include data quality and annotation reliability, model hallucinations, privacy, interpretability, and integration with clinical workflows and PACS. The discussion underscores initiatives like OpenMEDLab and domain-focused systems (e.g., ELIXR, MAIRA-1, RaDialog, RadLing) as practical steps toward radiology-ready AI assistance in education and reporting, with significant implications for workflow efficiency and patient care.

Abstract

Following the success of ChatGPT, research and development of large artificial intelligence (AI) models has surged worldwide. As people benefit from the convenience of these models, a growing number of large models for specialized fields are being proposed, particularly in radiological imaging. This article first introduces the development history of large models, their technical details and workflow, and the working principles of multimodal large models and video generation large models. Second, we summarize the latest research progress of large AI models in radiology education, radiology report generation, and unimodal and multimodal radiology applications. Finally, we summarize some of the challenges facing large AI models in radiology, with the aim of promoting rapid advances in the field of radiography.

Paper Structure (13 sections, 9 figures, 1 table)

This paper contains 13 sections, 9 figures, 1 table.

Figures (9)

  • Figure 1: The basic framework of the transformer, including the structure of the encoder and decoder and the technical details of the attention mechanism.
  • Figure 2: Flow chart from large model training to fine-tuning
  • Figure 3: The process of multimodal AI large model training and prediction (Kruse2022-yo).
  • Figure 4: The inverse process of diffusion learning for noisy images (Liu2024-qv).
  • Figure 5: Screenshots of a typical query run on several search engines, including StatDx (a widely used radiology search engine for analyzing complex or unusual cases) and Google, contrasted with a standard ChatGPT response to a query regarding the "differential diagnosis of a T2 and T1 hyperintense arterially enhancing lesion of the liver". The ChatGPT response is centralized, concise, and suitable for enhancing the learning experience of radiology trainees in the reading room. For the complete conversation, see https://chat.openai.com/share/88f09912-1cef-4cf0-a488-bf97086f9125 (tippareddy2023radiology).
  • ...and 4 more figures