LLM-based Agent Simulation for Maternal Health Interventions: Uncertainty Estimation and Decision-focused Evaluation

Sarah Martinson, Lingkai Kong, Cheol Woo Kim, Aparna Taneja, Milind Tambe

TL;DR

Traditional agent-based models (ABMs) struggle in data-scarce health settings, so this paper tests LLM-based agent simulations to predict engagement in a maternal mobile health program. It introduces an epistemic uncertainty estimation method based on binary entropy across multiple LLM samples and evaluates three ensembling strategies to improve robustness and calibration. A decision-focused evaluation pipeline gauges how predictions inform intervention feasibility and trial design under limited data. The authors argue that the approach extends to public health and disaster response, where rapid, data-efficient decision support is critical.

Abstract

Agent-based simulation is crucial for modeling complex human behavior, yet traditional approaches require extensive domain knowledge and large datasets. In data-scarce healthcare settings where historic and counterfactual data are limited, large language models (LLMs) offer a promising alternative by leveraging broad world knowledge. This study examines an LLM-driven simulation of a maternal mobile health program, predicting beneficiaries' listening behavior when they receive health information via automated messages (control) or live representatives (intervention). Since uncertainty quantification is critical for decision-making in health interventions, we propose an LLM epistemic uncertainty estimation method based on binary entropy across multiple samples. We enhance model robustness through ensemble approaches, improving F1 score and model calibration compared to individual models. Beyond direct evaluation, we take a decision-focused approach, demonstrating how LLM predictions inform intervention feasibility and trial implementation in data-limited settings. The proposed method extends to public health, disaster response, and other domains requiring rapid intervention assessment under severe data constraints. All code and prompts used for this work can be found at https://github.com/sarahmart/LLM-ABS-ARMMAN-prediction.
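The abstract's uncertainty idea can be illustrated with a minimal sketch. The exact formulation is not given in this excerpt, so the following assumes that each beneficiary receives several repeated binary LLM predictions (1 = engaged, 0 = not engaged) and that uncertainty is the binary entropy H(p) = -p log2(p) - (1 - p) log2(1 - p) of the empirical fraction p of positive samples. The function names `binary_entropy` and `sample_uncertainty` are hypothetical and are not taken from the authors' repository.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a Bernoulli(p) variable.

    Returns 0.0 at p = 0 or p = 1, where the limit of p*log2(p) is 0.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def sample_uncertainty(samples: list[int]) -> tuple[float, float]:
    """Summarize repeated binary predictions for one beneficiary.

    `samples` holds 0/1 outputs from repeated LLM queries (an assumed
    setup). Returns the empirical positive fraction and its binary entropy.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if any(s not in (0, 1) for s in samples):
        raise ValueError("samples must be 0 or 1")
    p_hat = sum(samples) / len(samples)
    return p_hat, binary_entropy(p_hat)


if __name__ == "__main__":
    # Unanimous samples give zero entropy; an even split gives the maximum.
    print(sample_uncertainty([1, 1, 1, 1]))
    print(sample_uncertainty([1, 0, 1, 0]))
```

Under this assumption, entropy is lowest when the samples agree and highest (1 bit) when they split evenly, so it can be used to flag predictions that deserve less weight in downstream decisions.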

Paper Structure

This paper contains 35 sections, 5 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Mean accuracy of component and ensemble models over time, grouped by provider.
  • Figure 2: Mean F1 score of component and ensemble models over time, grouped by provider.
  • Figure 3: Mean log-likelihood of component and ensemble models over time, grouped by provider.
  • Figure 4: Average total model accuracy by sociodemographic feature for component and ensemble models.
  • Figure 5: Mean engagement proportion over time in (left to right) the intervention, counterfactual and control settings.
  • ...and 2 more figures