Laplace Sample Information: Data Informativeness Through a Bayesian Lens
Johannes Kaiser, Kristian Schwethelm, Daniel Rueckert, Georgios Kaissis
TL;DR
The paper addresses the challenge of quantifying per-sample informativeness in deep learning. It introduces Laplace Sample Information (LSI), which uses a Laplace-approximated posterior to compute a KL divergence between the full-data and leave-one-out parameter distributions, yielding a per-sample informativeness score. Empirically, LSI orders samples by typicality, detects mislabeled data, reveals class- and dataset-level informativeness patterns, and transfers well from probe models to larger architectures, all while remaining computationally efficient. This approach offers a principled, scalable tool for data-centric ML tasks, enabling more efficient training, dataset auditing, and deeper insights into learning dynamics across modalities and settings.
Abstract
Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information (LSI), a measure of sample informativeness grounded in information theory that is widely applicable across model architectures and learning settings. LSI leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that LSI is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of LSI on image and text data in supervised and unsupervised settings. Moreover, we show that LSI can be computed efficiently through probes and transfers well to the training of large models.
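The idea described in the abstract (compare a Laplace-approximated weight posterior trained on the full dataset with one trained without a given sample, using the KL divergence) can be illustrated with a minimal sketch on a toy model. The code below is not the authors' implementation. It assumes a small L2-regularised logistic regression, a full-Hessian Laplace approximation, exact leave-one-out retraining, and the direction KL(full || leave-one-out); the function names `fit_map`, `laplace_posterior`, `gaussian_kl`, and `lsi_scores` and the parameter `prior_prec` are hypothetical and chosen only for illustration.

```python
import numpy as np


def _hessian(X, theta, prior_prec):
    # Hessian of the negative log posterior for logistic regression
    # with an isotropic Gaussian prior of precision `prior_prec`.
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return (X * (p * (1.0 - p))[:, None]).T @ X + prior_prec * np.eye(X.shape[1])


def fit_map(X, y, prior_prec=1.0, steps=50):
    # MAP estimate via Newton's method (assumed optimiser for this toy model).
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (p - y) + prior_prec * theta
        theta = theta - np.linalg.solve(_hessian(X, theta, prior_prec), grad)
    return theta


def laplace_posterior(X, y, prior_prec=1.0):
    # Laplace approximation: Gaussian centred at the MAP estimate,
    # with covariance equal to the inverse Hessian at that point.
    theta = fit_map(X, y, prior_prec)
    return theta, np.linalg.inv(_hessian(X, theta, prior_prec))


def gaussian_kl(mu0, S0, mu1, S1):
    # Closed-form KL( N(mu0, S0) || N(mu1, S1) ).
    k = mu0.size
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - k + logdet1 - logdet0)


def lsi_scores(X, y, prior_prec=1.0):
    # One score per sample: divergence between the full-data posterior
    # and the posterior obtained after removing that sample.
    mu_full, S_full = laplace_posterior(X, y, prior_prec)
    scores = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        mu_loo, S_loo = laplace_posterior(X[keep], y[keep], prior_prec)
        scores[i] = gaussian_kl(mu_full, S_full, mu_loo, S_loo)
    return scores
```

Calling `lsi_scores(X, y)` on a feature matrix `X` (with a bias column) and binary labels `y` returns one non-negative score per sample; larger values mean that removing the sample shifts the approximate posterior more. Exact retraining for every sample, as done here, scales poorly with dataset size, which is presumably why the abstract points to probes for efficient computation.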
