Federated TrustChain: Blockchain-Enhanced LLM Training and Unlearning

Xuhan Zuo; Minghao Wang; Tianqing Zhu; Lefeng Zhang; Dayong Ye; Shui Yu; Wanlei Zhou

TL;DR

This work tackles data privacy and data availability constraints in large-language-model training by integrating federated learning with a blockchain-based ledger. It introduces a framework that records all training contributions and unlearning actions on-chain and uses Low-Rank Adaptation (LoRA) to enable efficient, targeted forgetting of specific data via $W' = W + AB$. The key contributions include a secure client registration and aggregation workflow, an on-chain unlearning pipeline with verification, and an empirical study of LoRA hyperparameters (e.g., alpha, dropout, rank $r$) on IMDB and Twitter tasks, demonstrating unlearning performance close to retraining with a modest blockchain overhead. The results underscore the feasibility and benefits of combining blockchain, FL, and LoRA-based unlearning to achieve transparent, auditable, and privacy-preserving LLM development in decentralized environments.
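The low-rank update $W' = W + AB$ mentioned above can be illustrated with a minimal NumPy sketch. The dimensions `d`, `k`, `r`, the random initialisation, and the optional `scale` argument are illustrative assumptions, not values or procedures taken from the paper. The sketch only shows how the update is formed and why it is cheap to store.

```python
import numpy as np

def lora_merge(W, A, B, scale=1.0):
    """Return W' = W + scale * (A @ B); W itself is left unmodified.

    scale=1.0 reproduces the formula W' = W + AB as written in the text;
    the argument is a hypothetical convenience, not part of the source.
    """
    return W + scale * (A @ B)

rng = np.random.default_rng(0)
d, k, r = 64, 32, 4            # assumed layer shape and LoRA rank
W = rng.normal(size=(d, k))    # frozen base weights
A = rng.normal(size=(d, r))    # trainable low-rank factor
B = rng.normal(size=(r, k))    # trainable low-rank factor

W_prime = lora_merge(W, A, B)

# The update A @ B has rank at most r, and only r * (d + k) numbers
# need to be trained and exchanged instead of d * k.
full_params = d * k
lora_params = r * (d + k)
print(full_params, lora_params)
```

Because the base matrix `W` is never modified in place, the adapter factors `A` and `B` are the only quantities that change during fine-tuning in this sketch.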

Abstract

The development of Large Language Models (LLMs) faces a significant challenge: the exhaustion of publicly available fresh data, because training an LLM demands large volumes of new data. Federated learning emerges as a promising solution, enabling collaborating participants to contribute their private data to a global LLM. However, integrating federated learning with LLMs introduces new challenges, including the lack of transparency and the need for effective unlearning mechanisms. Transparency is essential to ensuring trust and fairness among participants, while accountability is crucial for deterring malicious behaviour and enabling corrective actions when necessary. To address these challenges, we propose a novel blockchain-based federated learning framework for LLMs that enhances transparency, accountability, and unlearning capabilities. Our framework leverages blockchain technology to create a tamper-proof record of each model's contributions and introduces an unlearning function that integrates with the federated learning mechanism. We investigate the impact of Low-Rank Adaptation (LoRA) hyperparameters on unlearning performance and integrate Hyperledger Fabric to ensure the security, transparency, and verifiability of the unlearning process. Through comprehensive experiments and analysis, we demonstrate that the proposed framework achieves effective unlearning in LLMs trained using federated learning. Our findings highlight the feasibility of integrating blockchain technology into federated learning frameworks for LLMs.
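The idea of a tamper-evident record of contributions and unlearning actions can be sketched with a plain hash chain. This is not Hyperledger Fabric and does not use its API; the record fields (`type`, `client`, `round`) and the helper names are hypothetical, chosen only to show why editing a past entry is detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry

def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of its predecessor."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append(ledger, payload):
    """Append a record that commits to the previous record's hash."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"prev": prev, "payload": payload,
                   "hash": entry_hash(prev, payload)})

def verify(ledger):
    """Recompute every hash and check that the links are intact."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True

ledger = []
append(ledger, {"type": "contribution", "client": "client-1", "round": 1})
append(ledger, {"type": "unlearning", "client": "client-1", "round": 2})

ok_before = verify(ledger)         # chain is intact
ledger[0]["payload"]["round"] = 99 # tamper with an old record
ok_after = verify(ledger)          # stored hash no longer matches
print(ok_before, ok_after)
```

A single-process list like this offers no consensus or access control; it only illustrates the hash-linking property that makes later modification of a logged training or unlearning event visible to anyone who re-verifies the chain.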

Paper Structure (37 sections, 7 equations, 7 figures, 4 tables, 5 algorithms)

Introduction
Related Work
Federated LLM
Unlearning with LLM
Blockchain with LLM
Conclusion
Preliminary
Federated Learning
Large Language Models (LLMs)
LoRA Fine-tuning
Blockchain
Problem Definition and System Model
Problem Definition
System Model
Participants
...and 22 more sections

Figures (7)

Figure 1: Overview and process of our proposed system. (1) Client registration. (2) Federated learning LLM training process. (3) Model aggregation process. (4) Unlearning process using LoRA for forgetting. (5) Unlearning verification and submission of unlearning results.
Figure 2: Box Plot of Accuracy by Different Alpha Values (IMDB Dataset)
Figure 3: Box Plot of Accuracy by Different Alpha Values (Twitter Dataset)
Figure 4: Box Plot of Accuracy by Different Dropout Values (IMDB Dataset)
Figure 5: Box Plot of Accuracy by Different Dropout Values (Twitter Dataset)
...and 2 more figures
