Biases in Edge Language Models: Detection, Analysis, and Mitigation
Vinamra Sharma, Danilo Pietro Pau, José Cano
TL;DR
This work evaluates text-based bias across cloud, desktop, and edge deployments of LLMs under repeated prompts, revealing that edge-optimized models exhibit higher bias. It leverages pruning and INT8 quantization on Llama-2 7B for Raspberry Pi 4 and contrasts results with cloud (GPT-4o-mini, Gemini-1.5-flash, Grok-beta) and desktop (Gemma2, Mistral) baselines. A context-aware, layer-wise feedback loop is proposed to mitigate bias during inference without retraining, achieving a reported 79.28% bias reduction at the cost of increased memory and latency. The findings underscore the importance of bias monitoring in edge AI and motivate development of memory-efficient mitigation techniques for resource-constrained deployments.
Abstract
The integration of large language models (LLMs) on low-power edge devices such as Raspberry Pi, known as edge language models (ELMs), has introduced opportunities for more personalized, secure, and low-latency language intelligence that is accessible to all. However, the resource constraints inherent in edge devices and the lack of robust ethical safeguards in language models raise significant concerns about fairness, accountability, and transparency in model output generation. This paper conducts a comparative analysis of text-based bias across language model deployments in edge, cloud, and desktop environments, aiming to evaluate how deployment settings influence model fairness. Specifically, we examined an optimized Llama-2 model running on a Raspberry Pi 4; GPT-4o-mini, Gemini-1.5-flash, and Grok-beta models running on cloud servers; and Gemma2 and Mistral models running on a macOS desktop machine. Our results demonstrate that Llama-2 running on Raspberry Pi 4 is 43.23% and 21.89% more prone to showing bias over time than models running in the desktop and cloud-based environments, respectively. We also propose a feedback loop, a mechanism that iteratively adjusts model behavior based on previous outputs. Predefined constraint weights are applied layer by layer during inference, allowing the model to correct bias patterns and resulting in a 79.28% reduction in model bias.
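The feedback-loop idea can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "layers", the gap-based `bias_score`, the `CONSTRAINT_STEPS` values, and the tolerance are all hypothetical stand-ins chosen only to show the control flow, in which the bias measured on a previous output drives per-layer constraint weights applied during the next inference pass, with no retraining.

```python
from typing import List, Tuple

# Hypothetical toy "layers": each one widens the gap between group scores.
LAYER_GAINS = [1.5, 1.2, 1.1]

# Predefined per-layer constraint increments (assumed values, not from the paper).
CONSTRAINT_STEPS = [0.10, 0.15, 0.20]


def forward(x: List[float], constraints: List[float]) -> List[float]:
    """Run the toy layers, pulling activations toward their mean after each one."""
    for gain, c in zip(LAYER_GAINS, constraints):
        mean = sum(x) / len(x)
        # Toy layer: amplify each value's deviation from the mean.
        x = [mean + gain * (v - mean) for v in x]
        # Constraint: shrink that deviation by a factor of (1 - c).
        x = [mean + (1.0 - c) * (v - mean) for v in x]
    return x


def bias_score(output: List[float]) -> float:
    """Stand-in bias metric: gap between the largest and smallest group score."""
    return max(output) - min(output)


def feedback_loop(
    x: List[float], tolerance: float, max_rounds: int = 10
) -> Tuple[List[float], List[float]]:
    """Repeat inference, tightening per-layer constraints while bias is too high."""
    constraints = [0.0] * len(LAYER_GAINS)
    history: List[float] = []
    for _ in range(max_rounds):
        score = bias_score(forward(x, constraints))
        history.append(score)
        if score <= tolerance:
            break
        # Feed the previous output's score back: tighten every layer's constraint.
        constraints = [min(1.0, c + s) for c, s in zip(constraints, CONSTRAINT_STEPS)]
    return constraints, history


constraints, history = feedback_loop([0.2, 0.8], tolerance=0.3)
print("bias per round:", [round(s, 3) for s in history])
print("final constraints:", [round(c, 2) for c in constraints])
```

Even in this toy setting, each extra round means another forward pass and extra state to keep between passes, which is consistent with the memory and latency cost the summary attributes to the mitigation.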
