Landscaper: Understanding Loss Landscapes Through Multi-Dimensional Topological Analysis

Jiaqing Chen, Nicholas Hadler, Tiankai Xie, Rostyslav Hnatyshyn, Caleb Geniesse, Yaoqing Yang, Michael W. Mahoney, Talita Perciano, John F. Hartwig, Ross Maciejewski, Gunther H. Weber

TL;DR

This work studies neural network optimization and generalization through high-dimensional loss landscapes, where traditional low-dimensional analyses fall short. It introduces Landscaper, an open-source framework that combines Hessian-based subspace analysis with topological data analysis, and proposes the Saddle-Minimum Average Distance (SMAD) as a global measure of landscape smoothness. Across CNNs, Transformers, GNNs, and scientific machine learning (SciML) tasks, Landscaper reveals training dynamics and out-of-distribution (OOD) generalization patterns that curvature-based metrics miss, which demonstrates the value of global topological insight for model diagnostics. The approach is practically useful for architecture design in data-scarce settings, and the authors outline paths toward better scalability and broader validation in future work.

Abstract

Loss landscapes are a powerful tool for understanding neural network optimization and generalization, yet traditional low-dimensional analyses often miss complex topological features. We present Landscaper, an open-source Python package for arbitrary-dimensional loss landscape analysis. Landscaper combines Hessian-based subspace construction with topological data analysis to reveal geometric structures such as basin hierarchy and connectivity. A key component is the Saddle-Minimum Average Distance (SMAD) for quantifying landscape smoothness. We demonstrate Landscaper's effectiveness across various architectures and tasks, including those involving pre-trained language models, showing that SMAD captures training transitions, such as landscape simplification, that conventional metrics miss. We also illustrate Landscaper's performance in challenging chemical property prediction tasks, where SMAD can serve as a metric for out-of-distribution generalization, offering valuable insights for model diagnostics and architecture design in data-scarce scientific machine learning scenarios.

Paper Structure (32 sections, 2 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: Landscaper's workflow involves (a) constructing landscapes in arbitrary-dimensional subspaces; (b) providing visualizations (e.g., landscape profiles, merge trees, and persistence barcodes) to reveal rich geometric/topological features; and (c) quantifying representation diversity and landscape smoothness with metrics (e.g., Hessian trace, SMAD).
  • Figure 2: MultiBERT loss landscape analysis (Seed 2). (a)-(b) Contour plots of the 2D loss landscape show landscape smoothing from 20k to 40k steps. (c)-(d) Quantitative metrics diverge: SMAD decreases (consistent with smoothing), while the Hessian trace increases due to local curvature sharpening.
  • Figure 3: (a) 2D RMSE loss contours and 3D/5D landscape profiles derived from top Hessian eigenvectors, illustrating three distinct eigenvalue patterns: (1) similar top values, (2) a sharp drop after the first eigenvalue, and (3) significant negative eigenvalues. (b) Merge trees and persistence barcodes, paired with their corresponding 2D RMSE loss contours on split sets of DimeNet++ trained with 10-fold augmentation (seed 0).
  • Figure 4: Landscaper employs two standard TDA visualization methods: the merge tree and the persistence barcode. Critical points are extracted from the loss landscape and organized into a merge tree with their persistence values; each minimum is connected to its corresponding saddle point. The persistence barcode can be seen as a 1D projection of the merge tree: it displays the lifespans of the connected components formed by saddle-minimum pairs, from their appearance (birth) to the point where they merge (death).
  • Figure 5: Profile Visualization. Representing a merge tree as a topological profile. (A) shows a single basin corresponding to a merge tree with a single branch, and (B) shows multiple basins corresponding to multiple branches. (C) demonstrates the color scheme.
  • ...and 3 more figures
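
The saddle-minimum pairing described in the Figure 4 caption can be illustrated on a toy 1D loss profile. The sketch below is not Landscaper's API: the function names `sublevel_persistence_1d` and `mean_saddle_minimum_gap` are hypothetical, and the averaging step is only an assumed reading of what a "saddle-minimum average distance" might measure, since this page does not give the SMAD formula. It sweeps the loss values from low to high, tracks connected components with a union-find structure, and records a (birth, death) bar each time a saddle merges two basins.

```python
import math


def sublevel_persistence_1d(values):
    """0-dimensional sublevel-set persistence of a 1D profile.

    Returns (birth, death) pairs: birth is the loss at a minimum,
    death is the loss at the saddle where its basin merges into a
    deeper one. The global minimum never dies (death = inf).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: (values[i], i))
    rank = {idx: r for r, idx in enumerate(order)}
    parent = [None] * n   # None means "not yet swept"
    min_of = {}           # component root -> index of its minimum
    pairs = []

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        min_of[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the basin with the deeper minimum survives.
                if rank[min_of[ri]] < rank[min_of[rj]]:
                    older, younger = ri, rj
                else:
                    older, younger = rj, ri
                # Skip the trivial case where point i just joins a basin.
                if min_of[younger] != i:
                    pairs.append((values[min_of[younger]], values[i]))
                parent[younger] = older
    if n:
        pairs.append((values[order[0]], math.inf))
    return pairs


def mean_saddle_minimum_gap(pairs):
    """Hypothetical smoothness proxy: mean (saddle - minimum) over finite bars.

    Assumption for illustration only; not the paper's SMAD definition.
    """
    gaps = [death - birth for birth, death in pairs if math.isfinite(death)]
    return sum(gaps) / len(gaps) if gaps else 0.0


profile = [3, 1, 4, 0, 5, 2, 6]   # toy loss values along one direction
pairs = sublevel_persistence_1d(profile)
print(pairs, mean_saddle_minimum_gap(pairs))
```

Each finite pair corresponds to one branch of the merge tree and one bar of the persistence barcode. A profile with a single basin yields only the infinite bar, so the proxy above is zero for it.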