Connectomics Informed by Large Language Models

Elinor Thompson, Tiantian He, Anna Schroder, Ahmed Abdulaal, Alec Sargood, Sonja Soskic, Henry F. J. Tregidgo, Daniel C. Alexander

TL;DR

The paper examines the use of large language models (LLMs) to generate anatomical priors for connectomics, targeting the false positives and false negatives that limit tractography. It develops a pipeline that combines prompting strategies with retrieval-augmented generation (RAG) to ground LLM outputs in parcellation context and neuroscience literature, and integrates the resulting priors into tractography filtering. The study reports edge-classification accuracy near 90% and shows that LLM-derived priors can improve a network diffusion model of pathology spread, with RAG providing verifiable citations. Limitations include potential LLM hallucinations and evaluation biases, but the framework offers a scalable, knowledge-grounded way to improve connectome construction and interpretation.

Abstract

Tractography is a unique method for mapping white matter connections in the brain, but tractography algorithms suffer from an inherent trade-off between sensitivity and specificity that limits accuracy. Incorporating prior knowledge of white matter anatomy is an effective strategy for improving accuracy and has been successful for reducing false positives and false negatives in bundle-mapping protocols. However, it is challenging to scale this approach for connectomics due to the difficulty in synthesising information relating to many thousands of possible connections. In this work, we develop and evaluate a pipeline using large language models (LLMs) to generate quantitative priors for connectomics, based on their knowledge of neuroanatomy. We benchmark our approach against an evaluation set derived from a gold-standard tractography atlas, identifying prompting techniques to elicit accurate connectivity information from the LLMs. We further identify strategies for incorporating external knowledge sources into the pipeline, which can provide grounding for the LLM and improve accuracy. Finally, we demonstrate how the LLM-derived priors can augment existing tractography filtering approaches by identifying true-positive connections to retain during the filtering process. We show that these additional connections can improve the accuracy of a connectome-based model of pathology spread, which provides supporting evidence that the connections preserved by the LLM are valid.

Paper Structure

This paper contains 35 sections, 10 figures, 3 tables.

Figures (10)

  • Figure 1: A schematic overview of the pipeline. The inputs are region names from a grey matter parcellation, and for each pair of regions the LLM is queried on the likelihood of a white matter connection between them. The outputs are saved in a machine-readable format that can be used to generate quantitative priors.
  • Figure 2: Schematic diagram showing the RAG pipeline for providing the LLM with contextual information about the parcellation. For each brain region, we retrieve relevant chunks from the document using keyword search, and pass these to the LLM to generate a short summary describing the location of the region. These summaries are then included in a system prompt.
  • Figure 3: Schematic diagram showing the RAG pipeline for grounding the LLM's responses in texts from a database of neuroscience papers. Relevant papers were downloaded from PubMed and split into chunks. The chunks were stored in a database alongside their vector embeddings and article metadata. Hybrid search was used to retrieve relevant chunks, which were shown to the LLM in the prompt to use in its classification of the region pair.
  • Figure 4: The text of the prompts used. The optional uncertainty prompt variant (UPV) is shown in grey font.
  • Figure 5: Bar charts showing the false positive and false negative rates across models and prompts. False positives (FP) refer to pairs classified as connected by the LLM that are not connected in the tractography atlas, and false negatives (FN) to those that are connected in the atlas but not classified as such by the LLM. Error bars show the spread across four repeats. The blue and orange bars correspond to the standard and uncertainty variants of the prompts, respectively.
  • ...and 5 more figures
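
The captions for Figures 1 and 5 describe querying each pair of regions and then scoring the answers against the tractography atlas. The following is a minimal sketch of that bookkeeping, not the authors' implementation. The region names, the toy atlas labels, and the `toy_llm_answers` lookup (which stands in for real LLM calls) are all hypothetical. The rate denominators follow the conventional definitions, FP / (FP + TN) and FN / (FN + TP), which is an assumption, since the captions do not state them.

```python
from itertools import combinations


def region_pairs(regions):
    """Return every unordered pair of distinct regions exactly once."""
    return list(combinations(regions, 2))


def error_rates(predicted, atlas):
    """Compute (false positive rate, false negative rate).

    predicted, atlas: dicts mapping a region pair to True (connected)
    or False (not connected). Denominators are an assumption:
    FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    fp = fn = tp = tn = 0
    for pair, truth in atlas.items():
        guess = predicted[pair]
        if guess and not truth:
            fp += 1  # classified as connected, absent from the atlas
        elif truth and not guess:
            fn += 1  # present in the atlas, classified as not connected
        elif truth and guess:
            tp += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical placeholder region names, not a real parcellation.
regions = ["A", "B", "C", "D"]
pairs = region_pairs(regions)

# Toy reference labels standing in for the atlas-derived evaluation set.
atlas = dict(zip(pairs, [True, False, True, False, True, False]))

# Toy answers standing in for the LLM's per-pair classifications.
toy_llm_answers = dict(zip(pairs, [True, True, False, False, True, False]))

fpr, fnr = error_rates(toy_llm_answers, atlas)
print(f"pairs={len(pairs)} FPR={fpr:.3f} FNR={fnr:.3f}")
```

In a real run, the toy lookup would be replaced by parsed model responses, and the rates would be computed once per repeat to obtain the spread that the Figure 5 error bars show.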