CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
Jiamu Zheng, Jinghuai Zhang, Tianyu Du, Xuhong Zhang, Jianwei Yin, Tao Lin
TL;DR
CollabEdit tackles the problem of non-destructive collaborative knowledge editing for large language models under strict privacy constraints. It introduces a model-merging-based framework in which parties communicate $\mathbf{K}\mathbf{K}^{\top}$, enabling privacy-preserving aggregation of local edits while closely matching the performance of an ideal Global-Edit. The paper identifies three core collaborative KE challenges (knowledge overlap, knowledge conflict, and knowledge forgetting) and offers a targeted remedy for each: residual-based overlap detection, data-augmentation-driven conflict resolution, and a dynamic covariance approach to memory preservation across rounds. Empirical evaluation on Multi-CounterFact and zsRE with GPT-2 XL and GPT-J demonstrates robust editing performance and privacy protection with both MEMIT and MALMEN as editing backends.
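The role of $\mathbf{K}\mathbf{K}^{\top}$ in the merging step can be illustrated with a minimal NumPy sketch. It assumes a MEMIT-style closed-form editor, $\Delta = \mathbf{R}\mathbf{K}^{\top}(\mathbf{C} + \mathbf{K}\mathbf{K}^{\top})^{-1}$, where $\mathbf{K}$ holds the edit keys, $\mathbf{R}$ the residuals, and $\mathbf{C}$ a shared covariance term. The function names (`local_edit`, `merge_edits`), the dimensions, and the random data are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def local_edit(R, K, C):
    """MEMIT-style closed form (assumed here): Delta = R K^T (C + K K^T)^{-1}."""
    A = C + K @ K.T  # symmetric; positive definite whenever C is
    # Solve Delta @ A = R @ K.T using the symmetry of A.
    return np.linalg.solve(A, (R @ K.T).T).T

def merge_edits(deltas, kkts, C):
    """Recover the global update from per-party (Delta_i, K_i K_i^T) pairs."""
    # Delta_i (C + K_i K_i^T) = R_i K_i^T, so this sum undoes each local inverse.
    numer = sum(d @ (C + g) for d, g in zip(deltas, kkts))
    A = C + sum(kkts)
    return np.linalg.solve(A, numer.T).T

# Hypothetical toy setup: 3 parties, 4 edits each, small hidden sizes.
rng = np.random.default_rng(0)
d_k, d_v, n_edits, n_parties = 8, 6, 4, 3
C = 0.5 * np.eye(d_k)
Ks = [rng.standard_normal((d_k, n_edits)) for _ in range(n_parties)]
Rs = [rng.standard_normal((d_v, n_edits)) for _ in range(n_parties)]

# Each party shares only its update and K_i K_i^T, never K_i or R_i.
deltas = [local_edit(R, K, C) for R, K in zip(Rs, Ks)]
kkts = [K @ K.T for K in Ks]

merged = merge_edits(deltas, kkts, C)
# Reference: a single editor that sees every party's keys and residuals.
global_delta = local_edit(np.hstack(Rs), np.hstack(Ks), C)
naive_sum = sum(deltas)  # simple weight addition, for comparison

print(np.allclose(merged, global_delta))
print(np.allclose(naive_sum, global_delta))
```

Under these assumptions the merged update coincides with the update a single global editor would compute, whereas simply summing the local updates does not, which is the sense in which merging via $\mathbf{K}\mathbf{K}^{\top}$ can mimic a global edit.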
Abstract
Collaborative learning of large language models (LLMs) has emerged as a new paradigm for utilizing private data from different parties while guaranteeing efficiency and privacy. Meanwhile, Knowledge Editing (KE) for LLMs has garnered increased attention due to its ability to manipulate the behaviors of LLMs explicitly, yet the collaborative KE case (in which the knowledge edits of multiple parties are aggregated in a privacy-preserving and continual manner) remains unexamined. To this end, this manuscript presents the first investigation of collaborative KE. We start by carefully identifying three unique challenges therein: knowledge overlap, knowledge conflict, and knowledge forgetting. We then propose a non-destructive collaborative KE framework, CollabEdit, which employs a novel model merging mechanism to mimic the global KE behavior while preventing a severe performance drop. Extensive experiments on two canonical datasets demonstrate the superiority of CollabEdit over destructive baselines, and the results shed light on addressing the three collaborative KE challenges and on future applications. Our code is available at https://github.com/LINs-lab/CollabEdit.
