CaseEdit: Enhancing Localized Commonsense Reasoning via Null-Space Constrained Knowledge Editing in Small Parameter Language Models

Varun Reddy, Yen-Ling Kuo

TL;DR

CaseEdit introduces a targeted, multi-stage dataset and generation pipeline to evaluate localized commonsense knowledge editing in small-parameter LLMs. By pairing typical and atypical context edits with four evaluation axes and comparing AlphaEdit to established baselines (ROME, MEMIT, MEND), the work shows that small models can internalize high-quality, context-dependent commonsense knowledge, enabling efficient edge-computing applications. The results demonstrate strong performance of AlphaEdit on CaseEdit and provide a scalable framework for assessing ripple effects, uncertainty, and generalization across sequential edits. This approach advances practical personalized assistants by enabling lightweight models to reason with household-specific knowledge while controlling interference with existing capabilities.

Abstract

Large language models (LLMs) perform well on factual recall and general reasoning but struggle to adapt to user-specific commonsense knowledge, a challenge that is particularly acute in small-parameter settings where computational efficiency is prioritized. To address this, we introduce CaseEdit, a new dataset and generation pipeline for evaluating localized, personalized commonsense knowledge editing in small LLMs. Built upon the ATOMIC20/20 commonsense graph, CaseEdit uses a multi-stage inference process to generate both typical and atypical contextual edits for household objects, paired with targeted evaluation questions across four axes: reliability, generalization, locality, and portability. We evaluate established knowledge editing methods on CaseEdit and show that AlphaEdit, a technique that uses null-space projection to minimize interference with unrelated knowledge, consistently outperforms the other methods when applied to a LLaMA 3.2 3B model and shows minimal ripple effects, even in scalability tests. Our results indicate that pairing CaseEdit with effective editing techniques such as AlphaEdit allows small models to internalize high-quality, context-sensitive commonsense knowledge, paving the way for lightweight, personalized assistants.
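
The null-space idea mentioned in the abstract can be illustrated with a small sketch. This is not the AlphaEdit implementation and is not taken from the paper: it is a toy, pure-Python version that assumes a linear layer acting on "key" vectors. It uses Gram-Schmidt orthogonalization as a stand-in for whatever decomposition a real editor would use. The names `orthonormal_basis`, `project_update`, and `matvec` are hypothetical helpers introduced only for this example. The point it shows is that once a weight update has been projected onto the null space of the preserved keys, adding it to the weights leaves the layer's outputs on those keys unchanged.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # a matrix is stored as a list of rows


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matvec(m: Matrix, x: Vector) -> Vector:
    """Apply matrix m (rows) to vector x."""
    return [dot(row, x) for row in m]


def orthonormal_basis(vectors: Matrix, tol: float = 1e-10) -> Matrix:
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis: Matrix = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def project_update(delta: Matrix, preserved_keys: Matrix) -> Matrix:
    """Remove from each row of `delta` its component in span(preserved_keys).

    The result maps every preserved key to the zero vector, so adding it
    to a weight matrix does not change the outputs for those keys.
    """
    basis = orthonormal_basis(preserved_keys)
    projected: Matrix = []
    for row in delta:
        r = list(row)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        projected.append(r)
    return projected


if __name__ == "__main__":
    # Toy data (assumed for illustration): two preserved keys in 3 dimensions.
    keys = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
    raw_update = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    safe_update = project_update(raw_update, keys)
    for k in keys:
        # Each printed vector should be numerically close to zero.
        print(matvec(safe_update, k))
```

In this toy setting the projected update can still change the output for a new key that lies outside the span of the preserved keys, which is the room an edit needs to write new knowledge.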

Paper Structure

This paper contains 28 sections, 9 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: ATOMIC20/20 tuple count distribution compared to other commonsense datasets (Hwang et al., 2021).
  • Figure 2: Changes in model reliability, generalization, locality, and portability over the number of commonsense edits.
  • Figure 3: Next-token probabilities and entropy during MCQ evaluations across relational buckets: HasProperty, ObjectUse, and AtLocation.
  • Figure 4: MMLU benchmark performance across different parameter sizes for Llama and Gemma models. Larger models generally yield higher scores.