Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain

Yanrui Du, Sendong Zhao, Muzhen Cai, Ming Ma, Danyang Zhao, Jiawei Cao, Bing Qin

TL;DR

This work introduces the notion of dual logic ability for LLMs and investigates how privatization in the medical domain affects this core reasoning capability. By constructing a data-intensive pipeline that combines medical literature abstracts, multi-round dialogues, and manually annotated dual-logic test data, the authors quantify stance consistency on paired prompts and reveal a substantial degradation in dual logic after privatization. They show that incorporating general-domain dual-logic data during instruction fine-tuning can restore and even improve dual logic alongside accuracy, with pre-training data acting as an enhancing factor. The study also demonstrates transferability of general-domain dual-logic data to other privatized datasets and highlights the influence of the base LLM on downstream dual-logic performance, providing a new benchmark and practical guidance for privatization strategies in real-world medical AI applications.

Abstract

Extensive studies have been devoted to privatizing general-domain Large Language Models (LLMs) as domain-specific LLMs by feeding them domain-specific data. However, these privatization efforts have often ignored a critical aspect: dual logic ability, a core reasoning ability of LLMs. Dual logic ability ensures that an LLM maintains a consistent stance when confronted with both positive and negative statements about the same fact. Our study focuses on how this ability is affected during the privatization process in the medical domain. We conduct several experiments that analyze the dual logic ability of LLMs by examining the consistency of the stance in their responses to paired questions about the same fact. In these experiments, we observe a significant decrease in the dual logic ability of existing LLMs after privatization. In addition, our results indicate that incorporating general-domain dual logic data into LLMs not only enhances their dual logic ability but also further improves their accuracy. These findings underscore the importance of prioritizing dual logic ability during the privatization process. Our study establishes a benchmark for future research on LLMs' dual logic ability during privatization and offers practical guidance for privatization efforts in real-world applications.
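
To make the idea of stance consistency on paired questions concrete, here is a minimal sketch in Python. It is not the paper's evaluation procedure: it assumes each reply has already been reduced to a binary yes/no answer, whereas the paper's human evaluation sorts replies into five categories. The names `is_consistent` and `consistency_rate` are hypothetical. Under this simplification, a model holds a consistent stance on a fact when it answers $Q$ and $\neg Q$ with opposite answers.

```python
from typing import List, Tuple


def is_consistent(answer_q: bool, answer_not_q: bool) -> bool:
    """Return True when the stance on the underlying fact is the same.

    Assumption (not from the paper): True means the model affirmed the
    question as asked. Because not-Q negates Q, affirming both or
    denying both contradicts itself, so the two answers must differ.
    """
    return answer_q != answer_not_q


def consistency_rate(pairs: List[Tuple[bool, bool]]) -> float:
    """Fraction of (Q, not-Q) answer pairs with a consistent stance."""
    if not pairs:
        return 0.0
    return sum(is_consistent(a, b) for a, b in pairs) / len(pairs)


# Toy answers, invented for illustration only.
pairs = [
    (True, False),   # affirms Q, denies not-Q: consistent
    (True, True),    # affirms both: inconsistent
    (False, True),   # denies Q, affirms not-Q: consistent
    (False, False),  # denies both: inconsistent
]
print(consistency_rate(pairs))  # 0.5
```

A consistent pair is not necessarily a correct one: a model can deny a true $Q$ and affirm $\neg Q$ and still count as consistent here, which is why the abstract reports accuracy separately from dual logic ability.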

Paper Structure (23 sections, 1 equation, 9 figures, 2 tables)

Figures (9)

  • Figure 1: An example where LLM maintains a consistent stance when faced with $Q$ and $\neg Q$. Antonym substitution represents the operation of converting $Q$ into $\neg Q$.
  • Figure 2: A case of constructed dialogue. The left side shows the literature abstract and the right side shows the constructed dialogue. The text covered in green color represents information from the literature abstract.
  • Figure 3: Cases of our annotated dual logic test data. AS, ANW, and SEP represent antonym substitution, adding negative words, and swapping entity positions operations respectively.
  • Figure 4: Process of constructing general domain dual logic data.
  • Figure 5: Human evaluation criteria. In our evaluation, LLMs' replies are classified into five categories.
  • ...and 4 more figures