MedDCR: Learning to Design Agentic Workflows for Medical Coding
Jiyang Zheng, Islam Nassar, Thanh Vu, Xu Zhong, Yang Lin, Tongliang Liu, Long Duong, Yuan-Fang Li
TL;DR
MedDCR addresses automated medical coding by learning agentic workflows through a closed-loop design–execute–reflect cycle. It introduces a meta-agent architecture (Designer, Coder, and Reflector) augmented by a memory archive that enables reuse and progressive refinement of coding pipelines under guideline constraints. Experiments on MDACE and ACI-BENCH show that MedDCR achieves state-of-the-art performance, e.g., a Micro-F1 of 0.51 on MDACE and an F1 of 0.52 on ACI-BENCH, while keeping the cost of workflow search low relative to the cost of execution. The work demonstrates that automated workflow optimization yields higher accuracy and greater interpretability than fixed, hand-crafted pipelines, and that the learned workflows can be integrated with expert workflows in a plug-and-play fashion for practical deployment in clinical coding.
Abstract
Medical coding converts free-text clinical notes into standardized diagnostic and procedural codes, which are essential for billing, hospital operations, and medical research. Unlike ordinary text classification, it requires multi-step reasoning: extracting diagnostic concepts, applying guideline constraints, mapping to hierarchical codebooks, and ensuring cross-document consistency. Recent advances leverage agentic LLMs, but most rely on rigid, manually crafted workflows that fail to capture the nuance and variability of real-world documentation, leaving open the question of how to systematically learn effective workflows. We present MedDCR, a closed-loop framework that treats workflow design as a learning problem. A Designer proposes workflows, a Coder executes them, and a Reflector evaluates predictions and provides constructive feedback, while a memory archive preserves prior designs for reuse and iterative refinement. On benchmark datasets, MedDCR outperforms state-of-the-art baselines and produces interpretable, adaptable workflows that better reflect real coding practice, improving both the reliability and trustworthiness of automated systems.
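To make the closed loop concrete, the following is a minimal Python sketch of a design, execute, reflect cycle with a memory archive, scored by micro-averaged F1. It is an illustration under stated assumptions, not the authors' implementation: the names `search`, `Archive`, `toy_designer`, `toy_coder`, and `toy_reflector`, the list-of-step-names workflow representation, and the two-entry lexicon are all hypothetical, and the toy functions stand in for what would be LLM-backed agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Workflow = List[str]  # hypothetical representation: ordered step names


def micro_f1(preds: List[Set[str]], gold: List[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all notes, then compute F1."""
    tp = sum(len(p & g) for p, g in zip(preds, gold))
    fp = sum(len(p - g) for p, g in zip(preds, gold))
    fn = sum(len(g - p) for p, g in zip(preds, gold))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


@dataclass
class Archive:
    """Memory of past designs as (workflow, score, feedback) entries."""
    entries: List[Tuple[Workflow, float, str]] = field(default_factory=list)

    def add(self, workflow: Workflow, score: float, feedback: str) -> None:
        self.entries.append((list(workflow), score, feedback))

    def best(self) -> Tuple[Workflow, float, str]:
        return max(self.entries, key=lambda e: e[1])


def search(
    designer: Callable[[Archive], Workflow],
    coder: Callable[[Workflow, str], Set[str]],
    reflector: Callable[[Workflow, List[Set[str]], List[Set[str]]], str],
    notes: List[str],
    gold: List[Set[str]],
    rounds: int,
) -> Archive:
    """Run the loop: propose a workflow, execute it, score it, store feedback."""
    archive = Archive()
    for _ in range(rounds):
        workflow = designer(archive)                     # design
        preds = [coder(workflow, n) for n in notes]      # execute
        score = micro_f1(preds, gold)
        feedback = reflector(workflow, preds, gold)      # reflect
        archive.add(workflow, score, feedback)           # remember
    return archive


# --- Toy stand-ins for the agents (assumptions, for illustration only) ---
LEXICON = {"diabetes": "E11.9", "hypertension": "I10"}  # illustrative mapping


def toy_designer(archive: Archive) -> Workflow:
    # Start simple; if earlier feedback reports misses, add a normalize step.
    if not archive.entries:
        return ["extract"]
    best_wf, _, feedback = archive.best()
    if "missed" in feedback and "normalize" not in best_wf:
        return ["normalize"] + best_wf
    return best_wf


def toy_coder(workflow: Workflow, note: str) -> Set[str]:
    # Case-sensitive keyword lookup unless the workflow normalizes case first.
    text = note.lower() if "normalize" in workflow else note
    return {code for term, code in LEXICON.items() if term in text}


def toy_reflector(workflow: Workflow, preds: List[Set[str]],
                  gold: List[Set[str]]) -> str:
    missed = sum(len(g - p) for p, g in zip(preds, gold))
    return f"missed {missed} codes" if missed else "ok"


NOTES = ["Diabetes and hypertension"]
GOLD = [{"E11.9", "I10"}]

if __name__ == "__main__":
    result = search(toy_designer, toy_coder, toy_reflector, NOTES, GOLD, rounds=3)
    for workflow, score, feedback in result.entries:
        print(workflow, round(score, 3), feedback)
```

In this toy run, the first workflow misses the capitalized term, the reflector's feedback is stored in the archive, and the next design adds a normalization step that recovers the missed code. The point of the sketch is only the control flow: the designer conditions on archived designs and feedback rather than starting from scratch each round.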
