Explanation sensitivity to the randomness of large language models: the case of journalistic text classification

Jeremie Bogaert, Marie-Catherine de Marneffe, Antonin Descampe, Louis Escouflaire, Cedrick Fairon, Francois-Xavier Standaert

TL;DR

Training with different random seeds produces models with similar accuracy but variable explanations; the authors therefore claim that characterizing the statistical distribution of the explanations is needed for the explainability of LLMs.

Abstract

Large language models (LLMs) perform very well in several natural language processing tasks but raise explainability challenges. In this paper, we examine the effect of random elements in the training of LLMs on the explainability of their predictions. We do so on a task of opinionated journalistic text classification in French. Using a fine-tuned CamemBERT model and an explanation method based on relevance propagation, we find that training with different random seeds produces models with similar accuracy but variable explanations. We therefore claim that characterizing the explanations' statistical distribution is needed for the explainability of LLMs. We then explore a simpler model based on textual features which offers stable explanations but is less accurate. Hence, this simpler model corresponds to a different tradeoff between accuracy and explainability. We show that it can be improved by inserting features derived from CamemBERT's explanations. We finally discuss new research directions suggested by our results, in particular regarding the origin of the sensitivity observed in the training randomness.
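
To make the idea of "characterizing the explanations' statistical distribution" concrete, here is a minimal sketch, not the authors' implementation. It assumes that the token relevance scores of several models trained with different seeds have already been computed for one article and stacked into an array of shape (n_models, n_tokens). The function name `summarize_explanations`, the chosen summary statistics, and the synthetic placeholder scores are all assumptions made for illustration.

```python
import numpy as np


def summarize_explanations(relevance):
    """Summarize how token relevance varies across models.

    relevance: array-like of shape (n_models, n_tokens), one row per
    model (i.e. per training seed), one column per token of one article.
    This layout is an assumption of the sketch, not taken from the paper.
    """
    relevance = np.asarray(relevance, dtype=float)
    if relevance.ndim != 2 or relevance.shape[0] < 2:
        raise ValueError("expected shape (n_models >= 2, n_tokens)")

    # Per-token mean and sample standard deviation across models.
    mean = relevance.mean(axis=0)
    std = relevance.std(axis=0, ddof=1)

    # Pearson correlation between every pair of models' explanations;
    # the upper triangle (k=1) excludes each model's self-correlation.
    corr = np.corrcoef(relevance)
    upper = np.triu_indices_from(corr, k=1)

    return {
        "mean": mean,
        "std": std,
        "mean_pairwise_corr": float(corr[upper].mean()),
    }


# Synthetic placeholder scores (NOT data from the paper): a signal shared
# by all models plus seed-specific noise, for 100 models and 20 tokens.
rng = np.random.default_rng(0)
shared_signal = rng.normal(size=20)
placeholder_scores = shared_signal + rng.normal(scale=0.5, size=(100, 20))

summary = summarize_explanations(placeholder_scores)
print(summary["std"].round(2))
print(summary["mean_pairwise_corr"])
```

A low mean pairwise correlation or a large per-token standard deviation would indicate that models with similar accuracy disagree on which tokens matter, which is the kind of variability the abstract describes.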

Paper Structure (23 sections, 2 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: Attention maps derived from the linguistic model (left, in orange) and from CamemBERT (right, in blue) for the same article from the validation corpus. Both models correctly predict the article in the opinion class.
  • Figure 2: Visual characterization of the explanations of 100 equivalent models. The X axis corresponds to the distribution of token relevance.
  • Figure 3: Attention maps of four different models for a news article (correctly classified by all models).