Beyond Curve Fitting: Neuro-Symbolic Agents for Context-Aware Epidemic Forecasting
Joongwon Chae, Runming Wang, Chen Xiong, Gong Yunhan, Lian Zhang, Ji Jiansong, Dongmei Yu, Peiwu Qin
TL;DR
This work introduces a hierarchical two-agent framework for forecasting hand, foot and mouth disease (HFMD) that couples an LLM-based event interpreter with a probabilistic forecast generator. Agent 1 translates external context (weather, school calendars, surveillance reports) into a Transmission Impact signal accompanied by a confidence score and a narrative rationale. Agent 2 grounds this signal in historical case counts to produce calibrated probabilistic forecasts. On HFMD datasets from Lishui and Hong Kong, the system achieves competitive point-forecast accuracy and 90% prediction intervals with empirical coverage between 0.85 and 1.00, and its natural-language explanations make the forecasts interpretable. The approach demonstrates the value of inference-time contextual reasoning for epidemic forecasting and offers practical benefits for public health decision-making, including clearer communication and better resource planning.
Abstract
Effective surveillance of hand, foot and mouth disease (HFMD) requires forecasts that account for both epidemiological patterns and contextual drivers such as school calendars and weather. Although classical models and recent foundation models (e.g., Chronos, TimesFM) can incorporate covariates, they often lack the semantic reasoning needed to interpret the causal interplay between conflicting drivers. In this work, we propose a two-agent framework that decouples contextual interpretation from probabilistic forecasting. An LLM "event interpreter" converts heterogeneous signals, including school schedules, meteorological summaries, and surveillance reports, into a scalar transmission-impact signal. A neuro-symbolic core then combines this signal with historical case counts to produce calibrated probabilistic forecasts. We evaluate the framework on real-world HFMD datasets from Hong Kong (2023-2024) and Lishui, China (2024). Compared with traditional and foundation-model baselines, our approach achieves competitive point-forecast accuracy while providing robust 90% prediction intervals (coverage 0.85-1.00) and human-interpretable rationales. Our results suggest that structurally integrating domain knowledge through LLMs can match state-of-the-art performance while yielding context-aware forecasts that align with public health workflows. Code is available at https://github.com/jw-chae/forecast_MED .
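
For readers unfamiliar with the coverage figure quoted above, the following minimal Python sketch shows how the empirical coverage of a prediction interval is typically computed: the fraction of observed values that fall inside their forecast interval. The function name `empirical_coverage` and the numbers in the usage example are illustrative assumptions, not taken from the paper or its repository.

```python
from typing import Sequence


def empirical_coverage(
    y_true: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> float:
    """Fraction of observations that lie within [lower, upper].

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if not (len(y_true) == len(lower) == len(upper)):
        raise ValueError("y_true, lower and upper must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")

    # Count how many observed values are covered by their interval.
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


# Made-up weekly case counts and 90% interval bounds (illustrative only).
observed = [10, 12, 15, 9]
interval_lower = [8, 9, 16, 7]
interval_upper = [12, 14, 20, 11]

coverage = empirical_coverage(observed, interval_lower, interval_upper)
print(coverage)
```

For a well-calibrated 90% interval, this fraction should be close to 0.90 over a sufficiently long evaluation window. Values well below 0.90 indicate intervals that are too narrow, and values near 1.00 may indicate intervals that are wider than necessary.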
