CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning
Junyu Luo, Xiaochen Wang, Jiaqi Wang, Aofei Chang, Yaqing Wang, Fenglong Ma
TL;DR
CoRelation tackles automatic ICD coding by learning contextualized, note-specific relationships among ICD codes. It combines contextualized code embeddings, a per-note flexible bipartite graph for relation learning, a graph-transformer update, and a self-adaptive gate to fuse direct and relation-enhanced predictions. A selective training strategy reduces computational cost while maintaining accuracy. Experiments on six public ICD datasets show state-of-the-art performance with substantially fewer parameters than PLM-based methods, and analysis demonstrates interpretable learned code relations.
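The gated fusion step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gated_fusion`, the per-code sigmoid gate, and the convex-combination form are assumptions chosen for illustration, and the logits, gate scores, and example codes are made-up values. In the actual model the gate scores would be learned from the note.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(direct_logits, relation_logits, gate_scores):
    """Fuse two per-code logit vectors with a per-code gate in (0, 1).

    Assumed form (hypothetical, for illustration only):
        fused = g * direct + (1 - g) * relation, with g = sigmoid(gate_score)
    """
    if not (len(direct_logits) == len(relation_logits) == len(gate_scores)):
        raise ValueError("all inputs must have one entry per ICD code")
    fused = []
    for d, r, s in zip(direct_logits, relation_logits, gate_scores):
        g = sigmoid(s)  # how much to trust the direct prediction
        fused.append(g * d + (1.0 - g) * r)
    return fused


# Toy example with three ICD-9 codes and made-up numbers.
codes = ["401.9", "428.0", "250.00"]
direct = [2.0, -3.0, 0.5]    # logits from the direct prediction path
relation = [1.0, 1.0, 0.5]   # logits from the relation-enhanced path
gate = [0.0, 0.0, 3.0]       # raw gate scores (learned in a real model)

fused = gated_fusion(direct, relation, gate)
probs = [sigmoid(z) for z in fused]
predicted = [c for c, p in zip(codes, probs) if p >= 0.5]
```

With a gate score of 0 the two paths are averaged, so the second code's strongly negative direct logit outweighs its positive relation logit and the code is not predicted. A larger gate score shifts the weight toward the direct path.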
Abstract
Automatic International Classification of Diseases (ICD) coding plays a crucial role in extracting relevant information from clinical notes for proper recording and billing. One of the most important directions for improving automatic ICD coding is modeling the relations among ICD codes. However, current methods model these intricate relations insufficiently and often overlook the context of the clinical note. In this paper, we propose a contextualized and flexible framework that enhances the learning of ICD code representations. Unlike existing methods, our approach employs a dependent learning paradigm that takes the context of the clinical note into account when modeling all possible code relations. We evaluate the approach on six public ICD coding datasets, and the experimental results demonstrate its effectiveness compared with state-of-the-art baselines.
