Almanac Copilot: Towards Autonomous Electronic Health Record Navigation
Cyril Zakka, Joseph Cho, Gracia Fahed, Rohan Shad, Michael Moor, Robyn Fong, Dhamanpreet Kaur, Vishnu Ravi, Oliver Aalami, Roxana Daneshjou, Akshay Chaudhari, William Hiesinger
TL;DR
Almanac Copilot targets clinician burnout by using a tool-using large language model to navigate electronic health records autonomously. It combines a 33B-parameter instruction-tuned LLM, Matryoshka embeddings, and external tools to perform information retrieval, data manipulation, and alert surfacing within a privacy-preserving, locally executed EHR environment. On the EHR-QA benchmark (300 synthetic, physician-authored queries), Almanac Copilot achieves a 74% task-success rate (221 of 300 tasks) with a mean score of 2.45 out of 3. These results point to the potential to streamline clinical workflows. The authors also acknowledge hallucination risks and the need for further work toward Level 2 autonomy and multi-modal data handling. The work suggests that clinically aligned AI agents can reduce cognitive load and improve efficiency in EMR use, with implications for safer and more scalable deployment in healthcare settings.
Abstract
Clinicians spend large amounts of time on clinical documentation, and the resulting inefficiencies reduce quality of care and increase clinician burnout. Despite the promise of electronic medical records (EMR), the transition from paper-based records has been negatively associated with clinician wellness, in part due to poor user experience, an increased documentation burden, and alert fatigue. In this study, we present Almanac Copilot, an autonomous agent capable of assisting clinicians with EMR-specific tasks such as information retrieval and order placement. On EHR-QA, a synthetic evaluation dataset of 300 common EHR queries based on real patient data, Almanac Copilot obtains a successful task completion rate of 74% (n = 221 of 300 tasks) with a mean score of 2.45 out of 3 (95% CI: 2.34-2.56). By automating routine tasks and streamlining the documentation process, autonomous agents show significant potential to mitigate the cognitive load imposed on clinicians by current EMR systems.
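As a rough illustration of how the headline numbers fit together, the sketch below recomputes the completion rate from the reported counts (221 of 300) and shows one common way to attach a 95% confidence interval to a mean rubric score. The abstract does not say how the interval was computed, so the normal approximation used here is an assumption. The function names and the sample scores are hypothetical and are not taken from EHR-QA.

```python
import math
import statistics


def completion_rate(n_success: int, n_total: int) -> float:
    """Fraction of tasks completed successfully."""
    return n_success / n_total


def mean_ci95(scores):
    """Mean with a normal-approximation 95% interval.

    Assumption: the source does not state its CI method; this uses
    mean +/- 1.96 * (sample standard deviation / sqrt(n)).
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Counts reported in the abstract: 221 successful tasks out of 300.
rate = completion_rate(221, 300)
print(f"completion rate: {rate:.1%}")  # 73.7%, reported as 74%

# Hypothetical per-query scores on a 0-3 scale, for illustration only.
example_scores = [3, 3, 2, 3, 1, 0, 3, 2]
mean, low, high = mean_ci95(example_scores)
print(f"mean score: {mean:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

With the full set of 300 per-query scores in place of `example_scores`, the same function would give an interval comparable in form to the reported 2.34-2.56, though the exact bounds depend on the method the authors used.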
