Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models

Zhixue Zhao; Nikolaos Aletras

TL;DR

This paper investigates whether explanations produced by feature attribution (FA) methods reflect a model's inner reasoning equally faithfully for multilingual and monolingual language models. It reports a large-scale cross-lingual study covering five languages, two kinds of model (multilingual and monolingual), and five FA methods across diverse tasks, assessing faithfulness with hard and soft sufficiency/comprehensiveness metrics and AOPC. The findings show that faithfulness disparities depend on model size and are associated with tokenization: larger multilingual models (e.g., XLM-R) tend to yield less faithful rationales than their monolingual counterparts, and aggressive multilingual tokenizers contribute to these gaps. The disparities are less pronounced under the soft faithfulness metrics, and targeted experiments point to tokenization as a likely primary driver. These results suggest practical guidelines for selecting models when explainability is critical, and they motivate future work on tokenizer-aware evaluation across more languages and architectures.
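To make the hard sufficiency and comprehensiveness metrics concrete, the sketch below uses the common erasure-style definitions: sufficiency compares the prediction on the full input with the prediction on the rationale alone, and comprehensiveness compares it with the prediction on the input with the rationale removed. These definitions are an assumption here; the paper may use normalised variants. The names `hard_faithfulness`, `top_k_indices`, and `toy_predict` are hypothetical, and `toy_predict` is a stand-in classifier rather than a real language model.

```python
from typing import Callable, List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring tokens, in input order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def hard_faithfulness(
    predict_proba: Callable[[List[str]], float],
    tokens: List[str],
    scores: Sequence[float],
    k: int,
) -> Tuple[float, float]:
    """Erasure-style sufficiency and comprehensiveness (assumed definitions).

    predict_proba maps a token list to the probability of the class that
    was predicted on the full input.
    """
    full = predict_proba(tokens)
    keep = set(top_k_indices(scores, k))
    rationale = [t for i, t in enumerate(tokens) if i in keep]
    remainder = [t for i, t in enumerate(tokens) if i not in keep]
    # Small drop when only the rationale is kept: the rationale is sufficient.
    sufficiency = full - predict_proba(rationale)
    # Large drop when the rationale is removed: the rationale is comprehensive.
    comprehensiveness = full - predict_proba(remainder)
    return sufficiency, comprehensiveness


def toy_predict(tokens: List[str]) -> float:
    """Hypothetical classifier: each 'great' raises the positive probability."""
    return min(1.0, 0.5 + 0.25 * tokens.count("great"))


tokens = ["the", "film", "was", "great"]
scores = [0.1, 0.2, 0.1, 0.9]  # hypothetical FA importance scores
suff, comp = hard_faithfulness(toy_predict, tokens, scores, k=1)
print(suff, comp)
```

In this toy case the single top-ranked token carries the whole prediction, so keeping only that token leaves the probability unchanged while removing it lowers the probability. Averaging such scores over several rationale lengths gives an AOPC-style summary.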

Abstract

In many real natural language processing application scenarios, practitioners not only aim to maximize predictive performance but also seek faithful explanations for the model predictions. Rationales and importance distributions given by feature attribution methods (FAs) provide insights into how different parts of the input contribute to a prediction. Previous studies have explored how different factors affect faithfulness, mainly in the context of monolingual English models. On the other hand, the differences in FA faithfulness between multilingual and monolingual models have yet to be explored. Our extensive experiments, covering five languages and five popular FAs, show that FA faithfulness varies between multilingual and monolingual models. We find that the larger the multilingual model, the less faithful the FAs are compared to its monolingual counterparts. Our further analysis shows that the faithfulness disparity is potentially driven by the differences between model tokenizers. Our code is available: https://github.com/casszhao/multilingual-faith.

Paper Structure (39 sections, 2 equations, 4 figures, 20 tables)

Introduction
Related Work
Faithfulness of monolingual models
Interpretability of multilingual models
Performance comparison of monolingual and multilingual models
Experiments
Models
Multilingual models.
Monolingual models.
Datasets
Implementation details
Feature attribution methods
Faithfulness evaluation
Hard Sufficiency & Comprehensiveness.
Soft Sufficiency & Comprehensiveness.
...and 24 more sections

Figures (4)

Figure 1: Model explanations given by the same feature attribution method, e.g. attention, for multilingual (XLM-R) and monolingual (French RoBERTa) models for the same task (sentiment analysis in FR).
Figure 2: Faithfulness disparity of FAs averaged across languages. Values above zero indicate that the FAs are more faithful in the multilingual model.
Figure 3: Amount of data in GiB (log-scale) for the 88 languages that appear in both the Wiki-100 corpus (used for mBERT) and the CC-100 corpus (used for XLM-R). CC-100 increases the amount of data by several orders of magnitude, in particular for low-resource languages (Conneau et al., 2020).
Figure 4: The impact of tokenization aggressiveness ("Fertility Diff" and "Splitting Diff") on faithfulness disparity ("Suff Diff" and "Comp Diff").
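The tokenization quantities named in the Figure 4 caption can be illustrated with a small sketch. It assumes the usual definitions, which this excerpt does not confirm: fertility is the average number of subword pieces per word, and the splitting ratio is the share of words broken into more than one piece. The function names and the fixed-length `toy_tokenize` are hypothetical stand-ins for real multilingual and monolingual tokenizers.

```python
from typing import Callable, List, Sequence, Tuple


def tokenization_stats(
    words: Sequence[str],
    tokenize: Callable[[str], List[str]],
) -> Tuple[float, float]:
    """Return (fertility, splitting ratio) under the assumed definitions."""
    pieces_per_word = [len(tokenize(w)) for w in words]
    fertility = sum(pieces_per_word) / len(words)
    split_ratio = sum(n > 1 for n in pieces_per_word) / len(words)
    return fertility, split_ratio


def toy_tokenize(word: str, max_len: int) -> List[str]:
    """Hypothetical tokenizer: cut a word into chunks of at most max_len chars."""
    return [word[i:i + max_len] for i in range(0, len(word), max_len)]


words = ["un", "film", "magnifique", "vraiment"]

# A more aggressive tokenizer (short pieces) versus a less aggressive one.
aggressive = tokenization_stats(words, lambda w: toy_tokenize(w, 4))
gentle = tokenization_stats(words, lambda w: toy_tokenize(w, 8))

fertility_diff = aggressive[0] - gentle[0]
splitting_diff = aggressive[1] - gentle[1]
print(aggressive, gentle, fertility_diff, splitting_diff)
```

Positive differences here mean the first tokenizer fragments the same words more heavily than the second, which is the kind of gap the figure relates to faithfulness disparity.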
