On Fairness of Task Arithmetic: The Role of Task Vectors
Hiroki Naganuma, Kotaro Yoshida, Laura Gomezjurado Gonzalez, Takafumi Horie, Yuji Naraki, Ryotaro Shimizu
TL;DR
This paper analyzes the fairness implications of task arithmetic using task vectors, comparing it to full fine-tuning (FFT) and Low-Rank Adaptation (LoRA) across hate-speech and toxicity NLP tasks and a vision age‑classification task. It demonstrates that a single global scalar $\lambda$ controlling merged subgroup task vectors can navigate the accuracy–fairness trade‑off, and it provides a theoretical bound linking $\lambda$‑driven deviations to Demographic Parity Difference ($DPD$) and Equalized Odds Difference ($EOD$). It also shows that merging subgroup vectors offers a practical mechanism to steer fairness outcomes, while subgroup‑targeted edits reveal nuanced, group‑dependent effects. Together these results position task arithmetic as both a cost‑efficient editing method and a fairness‑aware alternative to FFT/LoRA for standard group‑fair classification settings, with implications for responsible deployment of large language models.
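As a minimal sketch of the editing operation summarized above, the snippet below forms one task vector per subgroup (fine-tuned weights minus pretrained weights) and adds their sum, scaled by a single global $\lambda$, back onto the pretrained weights. The toy dictionary-of-lists parameter format, the helper names `task_vector` and `merge_with_lambda`, and the choice of a plain sum as the merge rule are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, pretrained: Params) -> Params:
    """Return tau = theta_finetuned - theta_pretrained, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def merge_with_lambda(pretrained: Params, vectors: List[Params], lam: float) -> Params:
    """Return theta_pre + lam * sum_g tau_g.

    Assumption: subgroup vectors are merged by a plain sum and scaled by
    one global coefficient; the paper's exact merge rule may differ.
    """
    merged: Params = {}
    for name, base in pretrained.items():
        summed = [sum(v[name][i] for v in vectors) for i in range(len(base))]
        merged[name] = [b + lam * s for b, s in zip(base, summed)]
    return merged


# Hypothetical weights for a pretrained model and two subgroup fine-tunes.
pretrained = {"w": [1.0, 2.0]}
group_a = {"w": [1.5, 2.0]}
group_b = {"w": [1.0, 3.0]}

vectors = [task_vector(group_a, pretrained), task_vector(group_b, pretrained)]
edited = merge_with_lambda(pretrained, vectors, lam=0.5)
```

Setting `lam=0.0` recovers the pretrained weights, so sweeping `lam` traces a one-dimensional path along which accuracy and fairness metrics can be evaluated.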
Abstract
Model editing techniques, particularly task arithmetic with task vectors, offer an efficient alternative to full fine-tuning by enabling direct parameter updates through simple arithmetic operations. While this approach promises substantial computational savings, its impact on fairness has remained largely unexplored, despite growing concern over biased outcomes in high-stakes applications such as hate speech detection. In this work, we present the first systematic study of group fairness in task arithmetic for binary text and image classification, comparing it against full fine-tuning (FFT) and Low-Rank Adaptation (LoRA). We evaluate across multiple language models and datasets using standard group fairness metrics, including Demographic Parity and Equalized Odds. Our analysis shows that task vectors can be tuned to achieve competitive accuracy while reducing disparities, and that merging subgroup-specific task vectors provides a practical mechanism for steering fairness outcomes. We further provide a theoretical bound linking task vector scaling to fairness metrics, offering insight into the observed trade-offs. Together, these findings establish task arithmetic not only as a cost-efficient editing method but also as a fairness-aware alternative to existing adaptation techniques within the standard group-fair classification setting, laying the groundwork for responsible deployment of large language models.
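For readers unfamiliar with the two fairness metrics named above, the following sketch computes them for binary predictions using their common definitions: the Demographic Parity Difference is the largest gap between groups in positive-prediction rate, and the Equalized Odds Difference is the larger of the between-group gaps in true-positive rate and false-positive rate. The function names `dpd` and `eod` and the toy data are assumptions for illustration, and the paper's exact formulation may differ in detail.

```python
from typing import Hashable, List, Sequence


def _positive_rate(preds: Sequence[int]) -> float:
    """Fraction of predictions equal to 1 (input must be non-empty)."""
    return sum(preds) / len(preds)


def dpd(y_pred: Sequence[int], groups: Sequence[Hashable]) -> float:
    """Demographic Parity Difference: max minus min positive-prediction rate."""
    rates: List[float] = []
    for g in set(groups):
        rates.append(_positive_rate([p for p, a in zip(y_pred, groups) if a == g]))
    return max(rates) - min(rates)


def eod(y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[Hashable]) -> float:
    """Equalized Odds Difference: the larger of the FPR gap and the TPR gap.

    Assumes every group contains at least one example of each true label.
    """
    gaps: List[float] = []
    for label in (0, 1):  # label 0 gives FPR, label 1 gives TPR
        rates: List[float] = []
        for g in set(groups):
            selected = [
                p for p, t, a in zip(y_pred, y_true, groups) if a == g and t == label
            ]
            rates.append(_positive_rate(selected))
        gaps.append(max(rates) - min(rates))
    return max(gaps)


# Hypothetical labels, predictions, and group memberships for eight examples.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(dpd(y_pred, groups))          # 0.5
print(eod(y_true, y_pred, groups))  # 0.5
```

Both metrics are zero when the groups are treated identically under the respective criterion, so lower values indicate smaller disparities.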
