A Novel ICD Coding Method Based on Associated and Hierarchical Code Description Distillation

Bin Zhang, Junli Wang

TL;DR

This paper proposes a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment, and shows its superiority over several state-of-the-art baselines.

Abstract

ICD (International Classification of Diseases) coding involves assigning ICD codes to patient visits based on their medical notes. ICD coding is a challenging multilabel text classification problem due to noisy medical document inputs. Recent advancements in automated ICD coding have enhanced performance by integrating additional data and knowledge bases with the encoding of medical notes and codes. However, most of these methods ignore the code hierarchy, leading to improper code assignments. To address these problems, we propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment. Specifically, we leverage the code description and the hierarchical structure inherent to the ICD codes. The code description is also used to make the attention layer and the output layer description-aware. Experimental results on the benchmark dataset show the superiority of the proposed framework over several state-of-the-art baselines.

Paper Structure (17 sections, 18 equations, 4 figures, 2 tables)


Figures (4)

  • Figure 1: An example of a medical note annotated with ICD-9 code "285.1", its parent code "285", and the sibling code "285.8". The words highlighted in red represent terms found in the descriptions of the child codes, aiding in the identification of key words within the medical note. Conversely, the words in blue denote terms that differ from the sibling code's description, serving to pinpoint more closely related words, thereby enhancing accuracy.
  • Figure 2: The architecture of the proposed AHDD method. $V_d$, $V_{c_A}$, and $V_{c_S}$ denote the label-specific representations of the medical note, associated code, and sibling code, respectively. Note that the Backbone Encoder can be implemented with various neural encoders.
  • Figure 3: Micro-averaged F1 by average note length group associated with each label, for the CAML, MultiResCNN, LAAT, Fusion, MSMN and Rare-ICD models. The x-axis represents the groups based on average note length, while the y-axis shows the micro-averaged F1 for each of these groups.
  • Figure 4: Visualization of the attention distribution over a medical note with a medical code for Fusion and its counterparts under the AHDD method. We highlight the highly weighted words
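
Figure 3 reports micro-averaged F1, which pools true positives, false positives, and false negatives over every (note, code) decision before computing a single F1 score. The sketch below shows that standard definition for multilabel code assignment. It is a minimal illustration, not the paper's evaluation code: the function name `micro_f1`, the set-based input format, and the toy gold and predicted codes are all assumptions made for this example.

```python
from typing import Sequence, Set


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all notes, then compute one F1."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of notes")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # codes predicted and actually assigned
        fp += len(p - g)   # codes predicted but not assigned
        fn += len(g - p)   # codes assigned but missed by the model
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when nothing is pooled
    return 2 * tp / denom if denom else 0.0


# Toy data, made up for illustration (not taken from the paper).
gold = [{"285.1", "401.9"}, {"285.8"}, {"401.9"}]
pred = [{"285.1"}, {"285.8", "285.1"}, {"401.9"}]
print(f"micro-F1 = {micro_f1(gold, pred):.3f}")  # pooled TP=3, FP=1, FN=1
```

Because the counts are pooled before averaging, frequent codes dominate the score. A per-group plot such as Figure 3 would presumably apply the same computation to the labels that fall in each note length group.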