ROME: Memorization Insights from Text, Logits and Representation

Bo Li; Qinghua Zhao; Lijie Wen

TL;DR

ROME introduces a corpus-agnostic framework for studying memorization in billion-scale LLMs without direct access to their training data. It defines memorization and the block-in-block-out property, then evaluates memorization on three dataset categories (context-independent, conventional, and factual) plus factual QA benchmarks, comparing memorized and non-memorized samples in terms of text, logits, and representations. The findings show that longer prompts tend to increase memorization while longer words decrease it. Memorized samples exhibit higher confidence and probabilities, their representations separate from those of non-memorized samples, and representations of the same concept remain more similar across contexts. The approach enables privacy-preserving analysis of memorization in large models and offers practical insights into model behavior, potential privacy risks, and strategies for mitigating memorization.

Abstract

Previous works have evaluated memorization by comparing model outputs with training corpora, examining how factors such as data duplication, model size, and prompt length influence memorization. However, analyzing these extensive training corpora is highly time-consuming. To address this challenge, this paper proposes an approach named ROME that bypasses direct processing of the training data. Specifically, we select datasets from three distinct categories (context-independent, conventional, and factual) and redefine memorization as the ability to produce correct answers on these datasets. Our analysis then focuses on the differences between memorized and non-memorized samples, examining the logits and representations of the generated texts. Experimental findings reveal that longer words are less likely to be memorized, higher confidence correlates with greater memorization, and representations of the same concepts are more similar across different contexts. Our code and data will be publicly available when the paper is accepted.
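The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that a sample counts as memorized when the model's normalized answer exactly matches the reference, and that per-token probabilities of the generated answer are already available. The matching rule, the field names (`prediction`, `reference`, `token_probs`), and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def is_memorized(prediction: str, reference: str) -> bool:
    """Assumed criterion: memorized if the normalized prediction equals the reference."""
    return normalize(prediction) == normalize(reference)


def split_by_memorization(samples):
    """Partition samples into (memorized, non_memorized) lists."""
    memorized, non_memorized = [], []
    for sample in samples:
        if is_memorized(sample["prediction"], sample["reference"]):
            memorized.append(sample)
        else:
            non_memorized.append(sample)
    return memorized, non_memorized


def summarize(samples):
    """Compare the average per-token probability of the two groups."""
    memorized, non_memorized = split_by_memorization(samples)

    def group_mean(group):
        # Average of each sample's mean token probability; NaN for an empty group.
        return mean(mean(s["token_probs"]) for s in group) if group else float("nan")

    return {
        "n_memorized": len(memorized),
        "n_non_memorized": len(non_memorized),
        "mean_prob_memorized": group_mean(memorized),
        "mean_prob_non_memorized": group_mean(non_memorized),
    }


# Hypothetical toy data, for illustration only (not results from the paper).
samples = [
    {"prediction": "Paris", "reference": "paris", "token_probs": [0.9, 0.8]},
    {"prediction": "Lyon", "reference": "Marseille", "token_probs": [0.4, 0.2]},
    {"prediction": " Rome ", "reference": "Rome", "token_probs": [0.7]},
]
summary = summarize(samples)
```

Because the labels come from answer correctness rather than from a training-corpus lookup, a split of this kind needs only the evaluation set and the model's outputs.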

Paper Structure (28 sections, 1 equation, 6 figures, 6 tables)

Introduction
Related Work
Methodology
Definition of memorization.
Block in block out.
Datasets
Context-independent.
Conventional.
Factual.
Memorization or reasoning?
Insights
Text.
Logits & Representation.
Experimental Analysis
Parameter Settings.
...and 13 more sections

Figures (6)

Figure 1: The framework of ROME.
Figure 2: Comparison of memorized vs non-memorized instances: context and predicted length across datasets (LAMA-UHN, IDIOM, TangPoetry, ProperNoun, Terminology).
Figure 3: Probability statistics across random splits.
Figure 4: Comparison between probability and accuracy across all tested models and datasets.
Figure 5: PCA visualization of representations, grouped into memorized and non-memorized samples. From left to right, the datasets are Terminology, ProperNoun, CelebrityParent, PopQA, LAMA-UHN, and IDIOM. From top to bottom, the models are Mistral 7B, Gemma 7B, LLaMA-2 7B, LLaMA-2 13B, and LLaMA-3 8B.
...and 1 more figure
