Adapting Abstract Meaning Representation Parsing to the Clinical Narrative -- the SPRING THYME parser
Jon Z. Cai, Kristin Wright-Bettner, Martha Palmer, Guergana K. Savova, James H. Martin
TL;DR
The work addresses the challenge of parsing clinical narratives into Abstract Meaning Representation by adapting the SPRING AMR parser to the THYME colon cancer corpus. It combines domain-specific annotation with data augmentation and a fine-tuning regime on THYME-AMR data, achieving high performance (SMATCH ~88) on the clinical dataset while managing catastrophic forgetting through joint-domain training. A detailed, fine-grained evaluation reveals substantial improvements in semantic components, especially in named entities and concepts, and demonstrates data-efficient adaptation with relatively small domain-specific datasets. The study highlights the importance of domain-specific AMR resources and augmentation strategies for robust clinical information extraction, with practical implications for scalable semantic interpretation of patient notes.
Abstract
This paper presents the design and evaluation of the first Abstract Meaning Representation (AMR) parser tailored to clinical notes. Our objective is to convert clinical notes accurately into structured AMR expressions, making clinical text more interpretable and usable at scale. Using the colon cancer dataset from the Temporal Histories of Your Medical Events (THYME) corpus, we adapted a state-of-the-art AMR parser through continuous training. Our approach incorporates data augmentation techniques to improve the accuracy of AMR structure predictions. With this learning strategy, our parser achieved an F1 score of 88% on the THYME corpus's colon cancer dataset. We also examined how much data is needed for domain adaptation to clinical notes and report the data requirements for adapting AMR parsing to this domain. These results demonstrate the parser's robust performance and its potential to support a deeper understanding of clinical narratives through structured semantic representations.
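As a rough illustration of what an F1 score over AMR graphs measures, the sketch below computes precision, recall, and F1 over the triples of a predicted and a gold graph. It is not the authors' evaluation code. It assumes the variables of the two graphs have already been aligned, whereas the actual Smatch metric searches for the variable mapping that maximizes the overlap. The function name `triple_f1` and the toy triples are invented for illustration and do not come from the THYME corpus.

```python
from typing import Set, Tuple

# A triple is (relation, source, target), e.g. ("instance", "d", "deny-01").
Triple = Tuple[str, str, str]


def triple_f1(predicted: Set[Triple], gold: Set[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching triples.

    Assumption: variable names in both graphs are already aligned.
    Real Smatch searches over variable mappings instead.
    """
    matched = len(predicted & gold)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy graphs for a sentence like "The patient denies fever."
gold = {
    ("instance", "d", "deny-01"),
    ("instance", "p", "patient"),
    ("instance", "f", "fever"),
    ("ARG0", "d", "p"),
    ("ARG1", "d", "f"),
}
predicted = {
    ("instance", "d", "deny-01"),
    ("instance", "p", "patient"),
    ("instance", "f", "fever"),
    ("ARG0", "d", "p"),
    ("ARG2", "d", "f"),  # wrong role label, so this triple does not match
}

precision, recall, f1 = triple_f1(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy pair the two graphs share four of their five triples, so precision, recall, and F1 all come out at 0.8. A corpus-level score such as the 88% reported above would aggregate matched and total triple counts over all sentence pairs rather than averaging per-sentence scores.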
