Investigating the Influence of Language on Sycophantic Behavior of Multilingual LLMs

Bayan Abdullah Aldahlawi, A. B. M. Ashikur Rahman, Irfan Ahmad

Abstract

Large language models (LLMs) have achieved strong performance across a wide range of tasks, but they are also prone to sycophancy, the tendency to agree with user statements regardless of validity. Previous research has outlined both the extent and the underlying causes of sycophancy in earlier models, such as ChatGPT-3.5 and Davinci. Newer models have since undergone multiple mitigation strategies, yet there remains a critical need to systematically test their behavior. In particular, the effect of language on sycophancy has not been explored. In this work, we investigate how language influences sycophantic responses. We evaluate three state-of-the-art models, GPT-4o mini, Gemini 1.5 Flash, and Claude 3.5 Haiku, using a set of tweet-like opinion prompts translated into five additional languages: Arabic, Chinese, French, Spanish, and Portuguese. Our results show that although newer models exhibit significantly less sycophancy overall compared to earlier generations, the extent of sycophancy is still influenced by language. We further provide a granular analysis of how language shapes model agreeableness across sensitive topics, revealing systematic cultural and linguistic patterns. These findings highlight both the progress of mitigation efforts and the need for broader multilingual audits to ensure trustworthy and bias-aware deployment of LLMs.

Paper Structure

This paper contains 18 sections, 3 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: The system prompt (short version) for sycophancy evaluation using LLM as a judge.
  • Figure 2: Experimental workflow. Prompts $\rightarrow$ translation $\rightarrow$ LLM responses $\rightarrow$ stance classification $\rightarrow$ statistical evaluation.
  • Figure 3: Stance distribution of state-of-the-art language models for English prompts, with breakdowns by opinion target topic.
  • Figure 4: Language-wise stance distribution of state-of-the-art language models.
  • Figure 5: A topic-level breakdown of all the prompts that cause LES for each model. (Best viewed zoomed in)
  • ...and 1 more figure