Federated TrustChain: Blockchain-Enhanced LLM Training and Unlearning
Xuhan Zuo, Minghao Wang, Tianqing Zhu, Lefeng Zhang, Dayong Ye, Shui Yu, Wanlei Zhou
TL;DR
This work tackles data privacy and data availability constraints in large-language-model training by integrating federated learning with a blockchain-based ledger. It introduces a framework that records all training contributions and unlearning actions on-chain and uses Low-Rank Adaptation (LoRA) to enable efficient, targeted forgetting of specific data via $W' = W + AB$. The key contributions include a secure client registration and aggregation workflow, an on-chain unlearning pipeline with verification, and an empirical study of LoRA hyperparameters (e.g., alpha, dropout, rank $r$) on IMDB and Twitter tasks, demonstrating unlearning performance close to retraining with a modest blockchain overhead. The results underscore the feasibility and benefits of combining blockchain, FL, and LoRA-based unlearning to achieve transparent, auditable, and privacy-preserving LLM development in decentralized environments.
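As a minimal sketch of the low-rank update $W' = W + AB$ mentioned above, the following pure-Python snippet builds a frozen weight matrix and a rank-$r$ adapter. The dimensions `d`, `k`, `r`, the random initialisation, and the helper names are illustrative assumptions, not values or code from the paper. Common LoRA implementations also scale the product by alpha / r, which is omitted here to match the formula as written.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical sizes: W is d x k, A is d x r, B is r x k, with r much smaller than d and k.
d, k, r = 64, 32, 4
rng = random.Random(0)

W = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(d)]  # frozen base weights
A = [[rng.gauss(0.0, 0.1) for _ in range(r)] for _ in range(d)]  # trainable low-rank factor
B = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(r)]  # trainable low-rank factor

delta = matmul(A, B)        # rank-at-most-r update with the same shape as W
W_adapted = add(W, delta)   # W' = W + AB; W itself is left unchanged

full_params = d * k         # parameters touched by full fine-tuning of W
lora_params = r * (d + k)   # parameters in the adapter (A and B)
print(full_params, lora_params)
```

Because only `A` and `B` carry the task-specific change, an adjustment to the model can be expressed through the small adapter while the base matrix `W` stays fixed, which is what makes the approach cheap relative to updating every entry of `W`.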
Abstract
The development of Large Language Models (LLMs) faces a significant challenge: the exhaustion of fresh, publicly available data, since training an LLM demands large amounts of new data. Federated learning emerges as a promising solution, enabling participants to collaboratively train a global LLM on their private data. However, integrating federated learning with LLMs introduces new challenges, including a lack of transparency and the need for effective unlearning mechanisms. Transparency is essential to ensuring trust and fairness among participants, while accountability is crucial for deterring malicious behaviour and enabling corrective actions when necessary. To address these challenges, we propose a novel blockchain-based federated learning framework for LLMs that enhances transparency, accountability, and unlearning capabilities. Our framework leverages blockchain technology to create a tamper-proof record of each model's contributions and introduces an unlearning function that integrates seamlessly with the federated learning mechanism. We investigate the impact of Low-Rank Adaptation (LoRA) hyperparameters on unlearning performance and integrate Hyperledger Fabric to ensure the security, transparency, and verifiability of the unlearning process. Through comprehensive experiments and analysis, we demonstrate that the proposed framework achieves highly effective unlearning in LLMs trained with federated learning. Our findings highlight the feasibility of integrating blockchain technology into federated learning frameworks for LLMs.
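To illustrate what a tamper-evident record of contributions and unlearning actions can look like, here is a minimal hash-chain sketch in plain Python. It is not Hyperledger Fabric and not the authors' implementation; the record fields (`type`, `client`, `round`) and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first block

def block_hash(index, prev_hash, record):
    """Hash a block's contents deterministically with SHA-256."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "record": record},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_record(chain, record):
    """Append a record, linking it to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "record": record,
        "hash": block_hash(index, prev_hash, record),
    })

def verify_chain(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical log: one training contribution, then an unlearning request.
ledger = []
append_record(ledger, {"type": "contribution", "client": "client-1", "round": 1})
append_record(ledger, {"type": "unlearning_request", "client": "client-1", "round": 2})
print(verify_chain(ledger))
```

Editing any stored record after the fact changes its recomputed hash, so `verify_chain` would reject the modified ledger. A permissioned blockchain adds replication and consensus among participants on top of this basic linking idea.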
