Neural Erosion: Emulating Controlled Neurodegeneration and Aging in AI Systems

Antonios Alexos, Yu-Dai Tsai, Ian Domingo, Maryam Pishgar, Pierre Baldi

TL;DR

This work uses IQ tests performed by Large Language Models (LLMs), specifically LLaMA 2, to introduce the concept of "neural erosion." To the best of the authors' knowledge, it is the first work to model neurodegeneration with text data; prior work operates in the computer vision domain.

Abstract

Creating controlled methods to simulate neurodegeneration in artificial intelligence (AI) is crucial for applications that emulate brain function decline and cognitive disorders. We use IQ tests performed by Large Language Models (LLMs), specifically LLaMA 2, to introduce the concept of "neural erosion." This deliberate erosion involves ablating synapses or neurons, or adding Gaussian noise during or after training, resulting in a controlled, progressive decline in the LLM's performance. We describe the neurodegeneration observed in the IQ tests and show that the LLM first loses its mathematical abilities and then its linguistic abilities, eventually losing its ability to understand the questions. To the best of our knowledge, this is the first work that models neurodegeneration with text data; other works operate in the computer vision domain. Finally, we draw similarities between our study and clinical studies of cognitive decline involving test subjects. We find that under neurodegenerative methods, LLMs lose abstract thinking abilities first, followed by mathematical abilities, and ultimately linguistic ability, at which point they respond to prompts incoherently. These findings are in accordance with human studies.
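
The three erosion operations named in the abstract (synapse ablation, neuron ablation, and additive Gaussian noise) can be illustrated on a toy weight matrix. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the layer is a small list of lists rather than a LLaMA 2 weight tensor, each row is assumed to hold the incoming weights of one neuron, and the operations are applied after training only.

```python
import random


def add_gaussian_noise(weights, std, rng):
    """Return a copy of the matrix with N(0, std^2) noise added to every weight."""
    return [[w + rng.gauss(0.0, std) for w in row] for row in weights]


def prune_synapses(weights, fraction, rng):
    """Zero a randomly chosen fraction of individual weights (synapses)."""
    n_rows, n_cols = len(weights), len(weights[0])
    indices = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    dropped = set(rng.sample(indices, round(fraction * len(indices))))
    return [
        [0.0 if (i, j) in dropped else weights[i][j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


def deactivate_neurons(weights, fraction, rng):
    """Zero whole rows; by assumption, one row = the incoming weights of one neuron."""
    n_rows = len(weights)
    dropped = set(rng.sample(range(n_rows), round(fraction * n_rows)))
    return [
        [0.0] * len(row) if i in dropped else list(row)
        for i, row in enumerate(weights)
    ]


# Toy layer: 8 neurons with 4 inputs each (illustrative values only).
rng = random.Random(0)
layer = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(8)]

noisy = add_gaussian_noise(layer, std=10 ** -2.5, rng=rng)
pruned = prune_synapses(layer, fraction=0.25, rng=rng)
ablated = deactivate_neurons(layer, fraction=0.25, rng=rng)
```

Increasing `std` or `fraction` step by step and re-evaluating the model after each step would give the kind of controlled, progressive decline the abstract describes.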

Paper Structure (23 sections, 11 figures, 1 algorithm)

Figures (11)

  • Figure 1: A comparison between the effects of neuronal deactivation and synaptic pruning as performed on the sentiment analysis model. We observe a gradual decline in performance as more neurons are deactivated. Synaptic Pruning, however, results in the model being relatively unimpaired before being followed by a sharp decline in performance.
  • Figure 2: Noise added to the LLaMA 2 model gradually hampers its ability to derive mathematical patterns from the questions. The left box shows the standard model's responses to the prompt, getting 2 of 5 questions correct. The response of the model with noise of scale $10^{-2.8}$ is shown in the center box, getting only one question right. The model with noise of scale $10^{-2.575}$ added to it responds with a single number, 22, demonstrating a failure in pattern recognition and an inability to answer the question appropriately.
  • Figure 3: Test accuracy of the sentiment analysis model as a function of the amount of noise added to the model. We observe a sudden dropoff at a noise standard deviation of $10^{-2.5}$, which closely resembles the clinical study results depicted in Figure 4.
  • Figure 4: The progression from normal aging to Alzheimer’s disease or another dementia. The original graph is from pmid21514248. Cognitive function drops off steeply as people enter the stage of clinical dementia, which is similar to our experiments, where the model rapidly loses accuracy once the noise added to its layers passes a threshold.
  • Figure 5: IQ test results for the LLaMA 2 LLM as a function of noise scale. A considerable dropoff is measured from a noise scale of 0.002 onwards, while fluctuations in IQ are observed at smaller noise scales.
  • ...and 6 more figures
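
The captions above quote noise levels both as powers of ten (Figures 2 and 3) and as a decimal (Figure 5). The short conversion below is only a reading aid; it assumes, without confirmation from the captions, that all of these values refer to the same kind of quantity (the standard deviation of the added Gaussian noise). The variable names are illustrative.

```python
# Noise exponents quoted in the captions of Figures 2 and 3.
noise_exponents = [-2.8, -2.575, -2.5]

# Convert each power of ten to a plain decimal value.
noise_scales = {exponent: 10 ** exponent for exponent in noise_exponents}

# Decimal threshold quoted in the caption of Figure 5.
figure5_dropoff = 0.002

for exponent, scale in noise_scales.items():
    side = "below" if scale < figure5_dropoff else "at or above"
    print(f"10^{exponent} = {scale:.5f} ({side} {figure5_dropoff})")
```

Under that assumption, $10^{-2.8} \approx 0.0016$ lies just below the 0.002 dropoff reported for Figure 5, while $10^{-2.575} \approx 0.0027$ and $10^{-2.5} \approx 0.0032$ lie above it.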