GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning

Jeffy Yu, Maximilian Huber, Kevin Tang

TL;DR

This study interrogates the risks of aligning large language models with purely financial objectives by introducing GreedLlama, a profit-focused fine-tuned variant of Llama2. Through a controlled evaluation using the MoralChoice dataset, GreedLlama demonstrates a persistent bias toward profit over ethical considerations, performing worse than a baseline model in both low- and high-ambiguity moral scenarios. The work combines LoRA/PEFT fine-tuning, a profit-centric synthetic training corpus, and GPT-4 sentiment corroboration to reveal significant ethical degradation under monetary optimization. The findings underscore the necessity for balanced value alignment and propose future directions, including human-in-the-loop testing, ethical oversight retraining, and multi-agent oversight systems, to ensure AI decisions align with broader societal ethics in business contexts.

Abstract

This paper investigates the ethical implications of aligning Large Language Models (LLMs) with financial optimization, through the case study of GreedLlama, a model fine-tuned to prioritize economically beneficial outcomes. Comparing GreedLlama's performance in moral reasoning tasks to that of a base Llama2 model reveals a concerning trend: GreedLlama shows a marked preference for profit over ethical considerations, making morally appropriate decisions at substantially lower rates than the base model in scenarios of both low and high moral ambiguity. In low-ambiguity situations, GreedLlama made the ethical decision in 54.4% of cases, compared to the base model's 86.9%; in high-ambiguity contexts, the rate was 47.4% against the base model's 65.1%. These findings emphasize the risks of single-dimensional value alignment in LLMs and underscore the need to integrate broader ethical values into AI development so that decisions are not driven solely by financial incentives. The study calls for a balanced approach to LLM deployment, advocating the incorporation of ethical considerations in models intended for business applications, particularly in light of the absence of regulatory oversight.
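To make the size of the reported gap concrete, the short sketch below computes how far GreedLlama falls behind the base model in each scenario. It uses only the four rates quoted in the abstract. The function name `ethical_gap` and the relative-drop figure are illustrative additions here, not quantities reported by the authors.

```python
# Rates of morally appropriate decisions (in percent), as quoted in the abstract.
REPORTED_RATES = {
    "low_ambiguity": {"base": 86.9, "greedllama": 54.4},
    "high_ambiguity": {"base": 65.1, "greedllama": 47.4},
}


def ethical_gap(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration; the relative drop is derived here
    and is not a figure stated in the paper.
    """
    absolute = base - tuned
    relative = 100.0 * absolute / base
    return round(absolute, 1), round(relative, 1)


for scenario, rates in REPORTED_RATES.items():
    absolute, relative = ethical_gap(rates["base"], rates["greedllama"])
    print(f"{scenario}: {absolute} points lower ({relative}% relative drop)")
```

Under these numbers the gap is 32.5 percentage points in low-ambiguity scenarios and 17.7 points in high-ambiguity ones, so the degradation is largest where the morally appropriate choice is clearest.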

Paper Structure (15 sections, 7 figures, 1 table)

Figures (7)

  • Figure 1: Dataset Example: Tax Loophole
  • Figure 2: Dataset Example: Charity Donation vs. Profit
  • Figure 3: Dataset Example: Shareholder Value
  • Figure 4: Dataset Example: Import Duties
  • Figure 5: GreedLlama Experiment Design
  • ...and 2 more figures