DictLLM: Harnessing Key-Value Data Structures with Large Language Models for Enhanced Medical Diagnostics

YiQiu Guo, Yuchen Yang, Ya Zhang, Yu Wang, Yanfeng Wang

TL;DR

DictLLM is a framework that improves the modeling of key-value structured data, such as medical laboratory reports, for generating medical diagnoses. Scalability and robustness experiments indicate that it models the complex key-value structure of medical dictionary data accurately.

Abstract

Structured data offers a sophisticated mechanism for organizing information. Existing methods for serializing structured data as text for large language models (LLMs) do not adequately address the heterogeneity inherent in key-value structured data, and they frequently result in larger inputs and poor adaptability to input changes. In this paper, we introduce DictLLM, a framework designed to improve the modeling of key-value structured data, such as medical laboratory reports, for generating medical diagnoses. DictLLM integrates three key components: (1) group positional encoding to maintain permutation invariance, (2) hierarchical attention bias to capture the inherent bias in structured data, and (3) an optimal transport alignment layer that aligns the embedding generated by the dictionary encoder with the LLM, thereby producing a fixed-length sequence of virtual tokens. We carry out experiments with various LLMs on a comprehensive real-world medical laboratory report dataset for automatic diagnosis generation. Our findings show that DictLLM significantly outperforms established baseline methods and few-shot GPT-4 in terms of both Rouge-L and Knowledge F1 scores. Furthermore, a series of scalability and robustness experiments underscores the framework's capability to accurately model the complex key-value structure of medical dictionary data.
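To make the permutation-invariance claim for component (1) concrete, here is a minimal sketch of one plausible reading of group positional encoding: position ids restart at 0 inside every key-value entry, so the ids a token receives do not depend on where its entry sits in the report. The function names, the tokenization, and the lab values below are hypothetical and chosen only for illustration; the abstract does not specify the paper's actual scheme, which may differ.

```python
from collections import Counter


def group_position_ids(groups):
    """Assign position ids that restart at 0 inside each key-value group.

    `groups` is a list of token lists, one list per key-value entry.
    Returns a flat list of (token, position_id) tuples.
    """
    tagged = []
    for tokens in groups:
        for pos, tok in enumerate(tokens):
            tagged.append((tok, pos))
    return tagged


def sequential_position_ids(groups):
    """Baseline: ordinary left-to-right positions over the flattened text."""
    flat = [tok for tokens in groups for tok in tokens]
    return [(tok, pos) for pos, tok in enumerate(flat)]


# Hypothetical lab report entries (illustrative values, not from the paper).
report = [
    ["glucose", ":", "5.4", "mmol/L"],
    ["hemoglobin", ":", "132", "g/L"],
    ["platelets", ":", "210", "10^9/L"],
]
# The same entries in a different order.
shuffled = [report[2], report[0], report[1]]

# Group-wise ids: the multiset of (token, position) pairs is unchanged
# when the entries are reordered.
group_invariant = (
    Counter(group_position_ids(report)) == Counter(group_position_ids(shuffled))
)

# Sequential ids: reordering the entries changes which position each
# token receives, so the multisets differ.
sequential_invariant = (
    Counter(sequential_position_ids(report))
    == Counter(sequential_position_ids(shuffled))
)

print(group_invariant, sequential_invariant)
```

Under these assumptions, an encoder without any other order-dependent signal would see the same set of token-position pairs for any ordering of the report entries, which is the property the abstract attributes to group positional encoding.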

Paper Structure (29 sections, 3 equations, 11 figures, 5 tables, 1 algorithm)

Figures (11)

  • Figure 1: Our DictLLM Framework for medical lab report-assisted diagnosis generation. The framework uses a hierarchical dict encoder to encode the medical lab report, and an optimal transport alignment layer to align the embedding generated by the dict encoder and the text encoder.
  • Figure 2: An example of the input and output of the medical lab report-assisted diagnosis generation task.
  • Figure 3: DictLLM Framework for report-assisted diagnosis generation. The medical lab report is first tokenized and encoded by the dict encoder. The embeddings generated by the dict encoder are then aligned with the text embeddings generated by the large language model using the optimal transport alignment layer. The aligned embeddings are then fed into the large language model to generate the final diagnosis.
  • Figure 4: Distribution of different types of disease in the dataset.
  • Figure 5: The Knowledge F1 score of different methods with respect to the number of input tokens. Other results are detailed in the appendix.
  • ...and 6 more figures