MedCoG: Maximizing LLM Inference Density in Medical Reasoning via Meta-Cognitive Regulation
Yu Zhao, Hao Guan, Yongcheng Jing, Ying Zhang, Dacheng Tao
TL;DR
MedCoG is a medical meta-cognition framework that regulates LLM reasoning to maximize inference density and efficiency. A Meta-Cognition Regulator assesses problem complexity, familiarity, and knowledge density, then steers the Knowledge Executor toward Procedural (SCoT), Factual (KG verification), or Episodic (memory) knowledge, so that each kind of knowledge is used only on demand. Two metrics, Inference Density and Inference Incremental Efficiency, quantify how effectively additional computation translates into performance gains. Experiments across five hard medical benchmarks show a 5.5x improvement in inference density together with improved accuracy. The Oracle study highlights the potential upper bound of meta-cognitive regulation and points to future work on adaptive calibration and knowledge evolution to close the gap to ideal strategy selection.
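The regulation step described above can be pictured as a small routing function. The following is only an illustrative sketch, not the MedCoG implementation: the names `Assessment`, `Knowledge`, and `select_knowledge`, the [0, 1] score range, the 0.5 threshold, and the pairing of each signal with a knowledge type are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Knowledge(Enum):
    """The three knowledge types named in the text."""
    PROCEDURAL = "procedural"  # SCoT
    FACTUAL = "factual"        # KG verification
    EPISODIC = "episodic"      # memory


@dataclass(frozen=True)
class Assessment:
    """Hypothetical self-assessment scores, each assumed to lie in [0, 1]."""
    complexity: float
    familiarity: float
    knowledge_density: float


def select_knowledge(a: Assessment, threshold: float = 0.5) -> list[Knowledge]:
    """Return the knowledge types to invoke for one question.

    ASSUMPTION: the mapping below (complexity -> procedural,
    knowledge density -> factual, low familiarity -> episodic) and the
    single shared threshold are illustrative choices, not taken from the paper.
    """
    selected = []
    if a.complexity >= threshold:
        selected.append(Knowledge.PROCEDURAL)
    if a.knowledge_density >= threshold:
        selected.append(Knowledge.FACTUAL)
    if a.familiarity < threshold:
        selected.append(Knowledge.EPISODIC)
    return selected


if __name__ == "__main__":
    easy = Assessment(complexity=0.2, familiarity=0.9, knowledge_density=0.1)
    hard = Assessment(complexity=0.8, familiarity=0.2, knowledge_density=0.7)
    print(select_knowledge(easy))  # no extra knowledge requested
    print(select_knowledge(hard))  # all three knowledge types requested
```

Under these assumptions, an easy and familiar question triggers no additional knowledge source, which is the sense in which on-demand reasoning avoids indiscriminate scaling.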
Abstract
Large Language Models (LLMs) have shown strong potential in complex medical reasoning, yet they face diminishing gains under inference scaling laws. While existing studies augment LLMs with various knowledge types, it remains unclear how effectively the additional costs translate into accuracy. In this paper, we explore how the meta-cognition of LLMs, i.e., their awareness of their own knowledge states, can regulate the reasoning process. Specifically, we propose MedCoG, a Medical Meta-Cognition Agent with Knowledge Graph, in which meta-cognitive assessments of task complexity, familiarity, and knowledge density dynamically regulate the use of procedural, episodic, and factual knowledge. This LLM-centric, on-demand reasoning aims to mitigate the diminishing returns of inference scaling by (1) reducing costs by avoiding indiscriminate scaling and (2) improving accuracy by filtering out distracting knowledge. To validate this, we empirically characterize the scaling curve and introduce inference density to quantify inference efficiency, defined as the ratio of theoretically effective cost to actual cost. Experiments on five hard sets of medical benchmarks demonstrate the effectiveness and efficiency of MedCoG, yielding a 5.5x improvement in inference density. Furthermore, the Oracle study highlights the significant potential of meta-cognitive regulation.
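As a rough illustration of the metric, inference density can be computed as a plain ratio once the two costs are known. The sketch below is hedged in two ways. First, it reads "theoretically effective cost" as the cost an empirical scaling curve would require to reach the accuracy actually achieved; that reading is an assumption. Second, it assumes a log-linear curve, accuracy = a + b * ln(cost), purely for illustration. The function names, the curve form, and the coefficients `a` and `b` are hypothetical and not taken from the paper.

```python
import math


def effective_cost(accuracy: float, a: float, b: float) -> float:
    """Invert an ASSUMED log-linear scaling curve: accuracy = a + b * ln(cost).

    Returns the cost that this curve would need to reach `accuracy`.
    The curve form and its coefficients are illustrative assumptions.
    """
    if b <= 0:
        raise ValueError("slope b must be positive for an increasing curve")
    return math.exp((accuracy - a) / b)


def inference_density(effective: float, actual: float) -> float:
    """Ratio of theoretically effective cost to actual cost, as defined in the text."""
    if actual <= 0:
        raise ValueError("actual cost must be positive")
    return effective / actual


if __name__ == "__main__":
    # Made-up numbers, used only to show the arithmetic.
    c_eff = effective_cost(accuracy=0.70, a=0.50, b=0.10)
    print(inference_density(effective=c_eff, actual=4.0))
```

A density above 1 would mean the method reaches its accuracy at a lower cost than the assumed scaling curve predicts; a density below 1 would mean part of the spent computation did not translate into accuracy.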
