FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video Analysis

Xiaotian Hu; Junwei Huang; Mingxuan Liu; Kasidit Anmahapong; Yifei Chen; Yitong Luo; Yiming Huang; Xuguang Bai; Zihan Li; Yi Liao; Haibo Qu; Qiyuan Tian

TL;DR

FetalAgents is proposed as the first multi-agent system for comprehensive fetal US analysis. It dynamically orchestrates specialized vision experts to maximize performance across diagnosis, measurement, and segmentation, and it provides an auditable, workflow-aligned solution.

Abstract

Fetal ultrasound (US) is the primary imaging modality for prenatal screening, yet its interpretation relies heavily on the expertise of the clinician. Despite advances in deep learning and foundation models, existing automated tools for fetal US analysis struggle to balance task-specific accuracy with the whole-process versatility required to support end-to-end clinical workflows. To address these limitations, we propose FetalAgents, the first multi-agent system for comprehensive fetal US analysis. Through a lightweight, agentic coordination framework, FetalAgents dynamically orchestrates specialized vision experts to maximize performance across diagnosis, measurement, and segmentation. Furthermore, FetalAgents advances beyond static image analysis by supporting end-to-end video stream summarization, where keyframes are automatically identified across multiple anatomical planes, analyzed by coordinated experts, and synthesized with patient metadata into a structured clinical report. Extensive multi-center external evaluations across eight clinical tasks demonstrate that FetalAgents consistently delivers the most robust and accurate performance when compared against specialized models and multimodal large language models (MLLMs), ultimately providing an auditable, workflow-aligned solution for fetal ultrasound analysis and reporting.
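
The video summarization flow described in the abstract can be pictured with a minimal sketch. It is only an illustration, not the authors' implementation: the `FramePrediction` type, the `select_keyframes` helper, the plane labels, the confidence threshold, and the rule of keeping the single most confident frame per plane are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FramePrediction:
    """Hypothetical per-frame output of a standard-plane classifier."""
    frame_index: int
    plane: str         # illustrative labels such as "head", "abdomen", "femur"
    confidence: float  # classifier confidence in [0, 1]


def select_keyframes(predictions, min_confidence=0.5):
    """Keep, for each plane, the most confident frame above the threshold.

    This selection rule is an assumption for illustration only.
    """
    best = {}
    for pred in predictions:
        if pred.confidence < min_confidence:
            continue  # skip frames the classifier is unsure about
        current = best.get(pred.plane)
        if current is None or pred.confidence > current.confidence:
            best[pred.plane] = pred
    return best


# Toy stream of per-frame predictions (made-up values).
stream = [
    FramePrediction(0, "head", 0.62),
    FramePrediction(1, "head", 0.91),
    FramePrediction(2, "abdomen", 0.40),
    FramePrediction(3, "abdomen", 0.77),
    FramePrediction(4, "femur", 0.30),
]
keyframes = select_keyframes(stream)
for plane, pred in sorted(keyframes.items()):
    print(f"{plane}: frame {pred.frame_index} (confidence {pred.confidence:.2f})")
```

In this toy stream, no femur frame clears the threshold, so the femur plane has no keyframe. A downstream report step would need to flag such a plane as missing rather than silently omit it.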

Paper Structure (10 sections, 2 figures, 1 table)

Introduction
Methods
Overview of FetalAgents
Fetal Ultrasound Analysis
Comprehensive Clinical Workflow
Experiments
Datasets and Experimental Setup
Quantitative Results
Image Captioning and Video Summarization Results
Conclusion

Figures (2)

Figure 1: System diagram of FetalAgents. The framework comprises three distinct modules: (1) Agentic Clinical Workflow, where the Coordinator interprets queries and dispatches tasks; (2) Image Analysis, where specialized experts execute specific vision tasks (classification, segmentation, biometry); and (3) Video Summary, which aggregates multi-plane keyframes and temporal findings into a comprehensive report.
Figure 2: Qualitative evaluation of clinical reporting. (a) Image Caption Generation: FetalAgents correctly identifies the standard plane and performs biometry to generate a report concordant with human experts. (b) Video Summarization: The system autonomously extracts multi-plane keyframes from a continuous video stream and cross-references ultrasound findings with patient metadata (LMP) for clinical consistency.
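
The coordinator-and-experts arrangement in the Figure 1 caption can be sketched as a simple dispatch table. This is a hedged illustration under stated assumptions: the expert functions are placeholders that run no model, the `EXPERTS` and `KEYWORDS` tables and the `coordinate` function are hypothetical names, and keyword matching is only a stand-in for however the actual Coordinator interprets queries.

```python
from typing import Callable, Dict, List

# Hypothetical expert signature: takes an image identifier, returns a result dict.
Expert = Callable[[str], dict]


def classify_plane(image_id: str) -> dict:
    # Placeholder expert: returns a fixed label instead of running a model.
    return {"task": "classification", "image": image_id, "plane": "placeholder"}


def segment_structures(image_id: str) -> dict:
    # Placeholder expert: a real one would return a segmentation mask.
    return {"task": "segmentation", "image": image_id, "mask": "placeholder"}


def measure_biometry(image_id: str) -> dict:
    # Placeholder expert: a real one would return a measurement.
    return {"task": "biometry", "image": image_id, "value_mm": None}


# Registry mapping the three vision tasks named in Figure 1 to experts.
EXPERTS: Dict[str, Expert] = {
    "classification": classify_plane,
    "segmentation": segment_structures,
    "biometry": measure_biometry,
}

# Assumed keyword routing, standing in for query interpretation.
KEYWORDS: Dict[str, str] = {
    "plane": "classification",
    "segment": "segmentation",
    "measure": "biometry",
}


def coordinate(query: str, image_id: str) -> List[dict]:
    """Pick the tasks a query mentions and run the matching experts in order."""
    text = query.lower()
    tasks = [task for keyword, task in KEYWORDS.items() if keyword in text]
    return [EXPERTS[task](image_id) for task in tasks]


results = coordinate("Identify the plane and measure the head", "img_001")
for result in results:
    print(result["task"], "->", result)
```

Keeping the experts behind a registry means each expert's output is recorded per task, which is one plausible way to keep the result trail auditable.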
