Prompt Sentiment: The Catalyst for LLM Change
Vishal Gandhi, Sagar Gandhi
TL;DR
The paper examines how prompt sentiment affects LLM outputs across multiple models and domains, using both lexicon-based and transformer-based sentiment analysis to categorize prompts as positive, neutral, or negative. It evaluates 500 prompts across six AI-driven applications with five LLMs, using a multi-dimensional metric framework that covers coherence, factuality, bias, and sentiment propagation. These are combined into a composite score $Q = \lambda_1 C + \lambda_2 F - \lambda_3 B - \lambda_4 S_p$, where $C$, $F$, $B$, and $S_p$ denote coherence, factuality, bias, and sentiment propagation, and the $\lambda_i$ are weights. Key findings: negative prompts reduce factuality by about 8.4% and tend to amplify bias on sensitive topics, while positive prompts can increase output length but modestly reduce factuality. Domain effects are pronounced, with subjective tasks showing stronger sentiment propagation than objective ones. The work argues for sentiment-aware prompt engineering to ensure fair, reliable, and context-appropriate AI content, and discusses theoretical, practical, and ethical implications along with avenues for future research, including cross-linguistic effects and real-time adaptive sentiment handling.
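To make the composite score concrete, here is a minimal Python sketch of $Q = \lambda_1 C + \lambda_2 F - \lambda_3 B - \lambda_4 S_p$. The `Metrics` container, the `composite_score` function, the equal default weights, and the assumption that every metric is normalized to [0, 1] are illustrative choices, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Per-response metrics, assumed here to be normalized to [0, 1]."""
    coherence: float              # C
    factuality: float             # F
    bias: float                   # B
    sentiment_propagation: float  # S_p


def composite_score(m: Metrics, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Compute Q = l1*C + l2*F - l3*B - l4*S_p.

    The equal weights are a placeholder assumption; the summary above
    does not state the lambda values.
    """
    l1, l2, l3, l4 = weights
    return (l1 * m.coherence
            + l2 * m.factuality
            - l3 * m.bias
            - l4 * m.sentiment_propagation)


# Placeholder inputs for illustration only; these are not results from the paper.
example = Metrics(coherence=0.8, factuality=0.8, bias=0.4, sentiment_propagation=0.4)
q = composite_score(example)
```

Because bias and sentiment propagation enter with negative signs, a response that is equally coherent and factual but carries more bias or more inherited sentiment receives a lower $Q$.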
Abstract
The rise of large language models (LLMs) has revolutionized natural language processing (NLP), yet the influence of prompt sentiment, a latent affective characteristic of input text, remains underexplored. This study systematically examines how sentiment variations in prompts affect LLM-generated outputs in terms of coherence, factuality, and bias. Leveraging both lexicon-based and transformer-based sentiment analysis methods, we categorize prompts and evaluate responses from five leading LLMs: Claude, DeepSeek, GPT-4, Gemini, and LLaMA. Our analysis spans six AI-driven applications: content generation, conversational AI, legal and financial analysis, healthcare AI, creative writing, and technical documentation. By transforming prompts to vary their sentiment, we assess the impact of sentiment on output quality. Our findings reveal that prompt sentiment significantly influences model responses, with negative prompts often reducing factual accuracy and amplifying bias, while positive prompts tend to increase verbosity and sentiment propagation. These results highlight the importance of sentiment-aware prompt engineering for ensuring fair and reliable AI-generated content.
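The lexicon-based categorization step can be illustrated with a toy sketch. The word lists, the `classify_prompt` function, and the zero threshold below are hypothetical stand-ins; the abstract does not say which lexicon or decision rule the authors used, and a real study would rely on an established sentiment lexicon rather than a handful of words.

```python
import re

# Tiny hypothetical lexicon, for illustration only.
POSITIVE = {"great", "excellent", "helpful", "wonderful", "love"}
NEGATIVE = {"terrible", "useless", "awful", "hate", "worst"}


def classify_prompt(prompt: str, threshold: int = 0) -> str:
    """Label a prompt as positive, neutral, or negative by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", prompt.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


labels = [
    classify_prompt("Please write an excellent summary of this wonderful report."),
    classify_prompt("Summarize the report."),
    classify_prompt("Explain this terrible, useless policy."),
]
```

A counting rule like this ignores negation and context ("not helpful" would score as positive), which is one reason a transformer-based classifier is a useful cross-check.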
