Leveraging Large Language Models for Analyzing Blood Pressure Variations Across Biological Sex from Scientific Literature

Yuting Guo, Seyedeh Somayyeh Mousavi, Reza Sameni, Abeed Sarker

TL;DR

This work addresses biases in blood pressure measurement by leveraging a large language model to extract sex-specific BP statistics from the biomedical literature. Using zero-shot prompting with GPT-3.5-turbo, the authors retrieve mean and standard deviation of SBP and DBP for males and females from 993 PubMed abstracts (out of ~25 million) and analyze distributional differences via heatmaps and Gaussian mixture models. The study demonstrates the feasibility of large-scale, literature-based BP analysis and finds that males generally exhibit higher BP than females, while also highlighting limitations such as potential hallucinations and reliance on abstracts. This approach offers a scalable pathway to assemble heterogeneous BP datasets from published studies, potentially informing clinical benchmarks and demographic considerations in BP assessment.
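
The zero-shot extraction step can be sketched as a prompt template plus a defensive parser. This is a minimal illustration, not the authors' code: the prompt wording, the JSON answer format, and the names `PROMPT_TEMPLATE`, `build_prompt`, and `parse_answer` are all assumptions, and the paper's actual prompt (its Figure 1) is not reproduced here. The model call itself is left out.

```python
import json

# Hypothetical zero-shot prompt; the paper's real prompt (Figure 1) may differ.
PROMPT_TEMPLATE = (
    "Extract the mean and standard deviation of systolic (SBP) and diastolic "
    "(DBP) blood pressure in mmHg for males and females from the abstract "
    "below. Answer with JSON using the keys male_sbp, male_dbp, female_sbp, "
    "female_dbp, each mapping to an object with the keys mean and sd. "
    "Use null for values the abstract does not report.\n\n"
    "Abstract: {abstract}"
)

# Assumed answer schema: one entry per sex and BP type.
EXPECTED_KEYS = ("male_sbp", "male_dbp", "female_sbp", "female_dbp")


def build_prompt(abstract: str) -> str:
    """Fill the 'abstract' placeholder with the abstract text."""
    return PROMPT_TEMPLATE.format(abstract=abstract.strip())


def _to_float(value):
    """Return a float for numeric input, otherwise None."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)


def parse_answer(raw: str) -> dict:
    """Parse the model's JSON answer, mapping missing or malformed values to None."""
    data = json.loads(raw)
    result = {}
    for key in EXPECTED_KEYS:
        entry = data.get(key)
        if not isinstance(entry, dict):
            entry = {}
        result[key] = {
            "mean": _to_float(entry.get("mean")),
            "sd": _to_float(entry.get("sd")),
        }
    return result
```

Requesting a fixed JSON schema is one possible design choice: it makes missing values explicit as `None` rather than leaving the parser to guess from free text.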

Abstract

Hypertension, defined as blood pressure (BP) that is above normal, is a major public health concern: it is a critical precursor to various cardiovascular diseases (CVDs) and contributes significantly to elevated mortality rates worldwide. However, many existing BP measurement technologies and standards might be biased because they do not consider clinical outcomes, comorbidities, or demographic factors, making them inconclusive for diagnostic purposes. There is limited data-driven research on how BP measurements vary across these variables. In this work, we employed GPT-35-turbo, a large language model (LLM), to automatically extract the mean and standard deviation values of BP for both males and females from a dataset comprising 25 million abstracts sourced from PubMed. Of these, 993 article abstracts met our predefined inclusion criteria (i.e., presence of references to blood pressure, units of blood pressure such as mmHg, and mention of biological sex). Based on the automatically extracted information from these articles, we analyzed the variations of BP values across biological sex. Our results showed the viability of utilizing LLMs to study BP variations across different demographic factors.
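
The three inclusion criteria named above can be approximated with a keyword filter. The abstract states the criteria but not the matching rules, so the regular expressions and the function name `meets_inclusion_criteria` below are assumptions chosen for illustration.

```python
import re

# Assumed patterns; the paper does not publish its exact matching rules.
BP_PATTERN = re.compile(r"\bblood pressure\b|\bSBP\b|\bDBP\b", re.IGNORECASE)
UNIT_PATTERN = re.compile(r"\bmm\s?Hg\b", re.IGNORECASE)
SEX_PATTERN = re.compile(
    r"\b(males?|females?|men|women|boys?|girls?)\b", re.IGNORECASE
)


def meets_inclusion_criteria(abstract: str) -> bool:
    """True if the abstract mentions BP, a BP unit, and biological sex."""
    patterns = (BP_PATTERN, UNIT_PATTERN, SEX_PATTERN)
    return all(p.search(abstract) for p in patterns)
```

A filter like this is cheap enough to run over millions of abstracts before any LLM call. It only checks that the terms are present, not that the abstract reports sex-specific BP values.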

Paper Structure

This paper contains 10 sections and 4 figures.

Figures (4)

  • Figure 1: The prompt used to have the LLM extract the mean and standard deviation of BP by biological sex, where 'abstract' is the placeholder for the abstract content.
  • Figure 2: Comparisons of blood pressure distributions between sexes, presented as heatmaps and contour plots; Gaussian mixture models are used for the heatmap visualization.
  • Figure 3: An example where the LLM averaged BP values reported within the abstract from Gracey1979. The mean SBP in males was predicted as 118.5 mmHg, which can be computed by averaging the SBP of 13-year-old boys (108 mmHg) and 17-year-old boys (129 mmHg). The same operation was performed for the mean SBP of females, the mean DBP of males, and the mean DBP of females.
  • Figure 4: An example where the LLM produced an incorrect answer that was not supported by information within the abstract from Nan1991. The standard deviations appearing in the LLM's answer (3 mmHg and 4 mmHg) did not appear in the abstract.
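
The two failure modes in Figures 3 and 4 suggest a simple post-hoc check: compare each extracted value against the numbers that actually appear in the abstract. The sketch below is an assumption, not part of the paper. It labels a value as copied verbatim, as the average of two reported numbers (the Figure 3 case), or as unsupported (the Figure 4 case). The names `numbers_in` and `classify_value` are hypothetical, and the example sentence is a stand-in built from the values quoted in the Figure 3 caption, not the real Gracey1979 abstract.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def numbers_in(text: str) -> set:
    """Collect every numeric literal in the text as a float."""
    return {float(m) for m in NUMBER.findall(text)}


def classify_value(value: float, abstract: str, tol: float = 1e-6) -> str:
    """Label a value as 'verbatim', 'derived_mean', or 'unsupported'."""
    reported = sorted(numbers_in(abstract))
    if any(abs(value - r) <= tol for r in reported):
        return "verbatim"
    # Check whether the value is the average of any two reported numbers.
    for i, a in enumerate(reported):
        for b in reported[i + 1:]:
            if abs(value - (a + b) / 2) <= tol:
                return "derived_mean"
    return "unsupported"


# Stand-in sentence using the values from the Figure 3 caption.
example = "SBP was 108 mmHg in 13-year-old boys and 129 mmHg in 17-year-old boys."
print(classify_value(118.5, example))  # (108 + 129) / 2
print(classify_value(3.0, example))    # not present and not a pairwise mean
```

Because the check considers every pair of numbers, including ages and sample sizes, it can flag a value as `derived_mean` by coincidence. It is better suited to triaging answers for manual review than to serving as a final verdict.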