Position: Curvature Matrices Should Be Democratized via Linear Operators

Felix Dangel; Runa Eschenhagen; Weronika Ormaniec; Andres Fernandez; Lukas Tatzel; Agustinus Kristiadi

TL;DR

The work argues that curvature matrices central to neural-network training and analysis should be accessed via a unified linear-operator interface, enabling scalable, matrix-free computation and easy integration with existing tools. It introduces curvlinops, a PyTorch library that exposes Hessian, GGN, Fisher variants, and KFAC forms as linear operators, with safeguards, batch handling, and interoperability features. The approach demonstrates how operator abstractions simplify applications (e.g., second-order optimization, influence functions, model merging, pruning, and loss analysis), while enabling extensibility through connections to randomized linear algebra and SciPy ecosystems. This democratizes access to advanced curvature techniques, offering practical impact for large-scale models and diverse ML tasks, and sets the stage for future improvements like multi-GPU support and differentiable operators.

Abstract

Structured large matrices are prevalent in machine learning. A particularly important class is curvature matrices like the Hessian, which are central to understanding the loss landscape of neural nets (NNs), and enable second-order optimization, uncertainty quantification, model pruning, data attribution, and more. However, curvature computations can be challenging due to the complexity of automatic differentiation, and the variety and structural assumptions of curvature proxies, like sparsity and Kronecker factorization. In this position paper, we argue that linear operators -- an interface for performing matrix-vector products -- provide a general, scalable, and user-friendly abstraction to handle curvature matrices. To support this position, we developed $\textit{curvlinops}$, a library that provides curvature matrices through a unified linear operator interface. We demonstrate with $\textit{curvlinops}$ how this interface can hide complexity, simplify applications, be extensible and interoperable with other libraries, and scale to large NNs.
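
To make the abstraction concrete, here is a minimal pure-Python sketch of a curvature matrix exposed only through matrix-vector products. It is not the curvlinops API: the names `MatrixFreeOperator`, `least_squares_hessian`, and `top_eigenvalue` are hypothetical, and the least-squares loss is an assumption chosen because its Hessian, $X^\top X$, is known in closed form. The point is that a downstream routine (here, power iteration) only ever needs the product of the operator with a vector, so the matrix itself is never formed.

```python
import math
from typing import Callable, List

Vector = List[float]


class MatrixFreeOperator:
    """A square linear operator defined only by its matrix-vector product.

    Hypothetical illustration class; not part of any library.
    """

    def __init__(self, dim: int, matvec: Callable[[Vector], Vector]) -> None:
        self.shape = (dim, dim)
        self._matvec = matvec

    def __matmul__(self, v: Vector) -> Vector:
        # Basic safeguard: reject vectors of the wrong length.
        if len(v) != self.shape[1]:
            raise ValueError(
                f"expected a vector of length {self.shape[1]}, got {len(v)}"
            )
        return self._matvec(v)


def least_squares_hessian(X: List[Vector]) -> MatrixFreeOperator:
    """Hessian of L(w) = 0.5 * sum_i (x_i . w - y_i)^2, which equals X^T X.

    The product is evaluated as X^T (X v), so X^T X is never stored.
    """
    dim = len(X[0])

    def matvec(v: Vector) -> Vector:
        Xv = [sum(x_j * v_j for x_j, v_j in zip(row, v)) for row in X]
        return [sum(row[j] * r for row, r in zip(X, Xv)) for j in range(dim)]

    return MatrixFreeOperator(dim, matvec)


def top_eigenvalue(op: MatrixFreeOperator, num_iters: int = 100) -> float:
    """Estimate the largest eigenvalue by power iteration, using only op @ v."""
    dim = op.shape[0]
    v = [1.0 / math.sqrt(dim)] * dim  # unit-norm starting vector
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = op @ v
        # Rayleigh quotient; valid because v has unit norm.
        eigenvalue = sum(w_i * v_i for w_i, v_i in zip(w, v))
        norm = math.sqrt(sum(w_i * w_i for w_i in w))
        if norm == 0.0:
            break
        v = [w_i / norm for w_i in w]
    return eigenvalue


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
H = least_squares_hessian(X)

print(H @ [1.0, 0.0])     # first column of X^T X: [35.0, 44.0]
print(top_eigenvalue(H))  # sharpness of the quadratic loss
```

Swapping in a different curvature matrix would only require a different `matvec`; `top_eigenvalue` stays unchanged, which is the kind of decoupling the abstract argues for.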
