Evaluating Loss Landscapes from a Topology Perspective

Tiankai Xie, Caleb Geniesse, Jiaqing Chen, Yaoqing Yang, Dmitriy Morozov, Michael W. Mahoney, Ross Maciejewski, Gunther H. Weber

TL;DR

This work characterizes the underlying shape (or topology) of loss landscapes and shows how quantifying that topology can provide new insights into neural network performance and learning dynamics.

Abstract

Characterizing the loss of a neural network with respect to model parameters, i.e., the loss landscape, can provide valuable insights into properties of that model. Various methods for visualizing loss landscapes have been proposed, but less emphasis has been placed on quantifying and extracting actionable and reproducible insights from these complex representations. Inspired by powerful tools from topological data analysis (TDA) for summarizing the structure of high-dimensional data, here we characterize the underlying shape (or topology) of loss landscapes, quantifying the topology to reveal new insights about neural networks. To relate our findings to the machine learning (ML) literature, we compute simple performance metrics (e.g., accuracy, error), and we characterize the local structure of loss landscapes using Hessian-based metrics (e.g., largest eigenvalue, trace, eigenvalue spectral density). Following this approach, we study established models from image pattern recognition (e.g., ResNets) and scientific ML (e.g., physics-informed neural networks), and we show how quantifying the shape of loss landscapes can provide new insights into model performance and learning dynamics.
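
To make the Hessian-based metrics mentioned above concrete, the following is a minimal sketch (not the authors' implementation) of how the largest eigenvalue and the trace can be estimated using only Hessian-vector products. The toy quadratic loss, the matrix `A`, and the helper names `hvp`, `top_eigenvalue`, and `hutchinson_trace` are assumptions made for illustration; the techniques used here are power iteration and Hutchinson's randomized trace estimator.

```python
import numpy as np

# Hypothetical toy loss L(theta) = 0.5 * theta^T A theta; its Hessian is exactly A.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])


def grad(theta):
    """Gradient of the toy loss."""
    return A @ theta


def hvp(theta, v, eps=1e-4):
    """Hessian-vector product via central differences of the gradient."""
    return (grad(theta + eps * v) - grad(theta - eps * v)) / (2.0 * eps)


def top_eigenvalue(theta, iters=100, seed=0):
    """Power iteration; finds the largest-magnitude Hessian eigenvalue."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(theta.shape)
    v /= np.linalg.norm(v)
    eig = 0.0
    for _ in range(iters):
        hv = hvp(theta, v)
        eig = float(v @ hv)          # Rayleigh quotient for the current unit vector
        v = hv / np.linalg.norm(hv)  # renormalize for the next step
    return eig


def hutchinson_trace(theta, samples=4000, seed=0):
    """Hutchinson estimator: average of v^T H v over random sign vectors v."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(samples):
        v = rng.choice([-1.0, 1.0], size=theta.shape)
        total += float(v @ hvp(theta, v))
    return total / samples


theta = np.zeros(3)
lam_max = top_eigenvalue(theta)    # close to (5 + sqrt(5)) / 2 for this A
trace_est = hutchinson_trace(theta)  # noisy estimate of trace(A) = 6
```

For a real network, `grad` would be replaced by the gradient of the training loss with respect to the parameters, and the finite-difference product would typically be replaced by an automatic-differentiation Hessian-vector product.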

Paper Structure

This paper contains 11 sections, 4 equations, 6 figures.

Figures (6)

  • Figure 1: Visualizing loss landscapes for ResNet-20. Compare with yao2020pyhessian.
  • Figure 2: Visualizing the failure modes of PINNs. Compare with krishnapriyan2021characterizing.
  • Figure 3: Illustration of generating the two-dimensional loss landscape. In this work, to compute a loss landscape, we first construct a subspace defined by two vectors, $\theta_1$ and $\theta_2$. Within this subspace, we perform model interpolation and evaluate the corresponding loss values.
  • Figure 4: Our loss landscape analysis pipeline. The pipeline includes six stages: (1) Subspace Definition: Define the loss landscape subspace using random-based or Hessian-based directions. (2) Loss Computation: Calculate loss values for coordinate locations using our coordinate-based loss computation method. (3) Data Representation: Transform the loss landscape into suitable data structures for TDA. (4) Topological Analysis: Extract the merge tree and persistence diagram. (5) Quantitative Evaluation: Quantify the merge tree and persistence diagram. (6) Loss Landscape Property Evaluation: Relate the TDA-based metrics to loss landscape properties.
  • Figure 5: Quantifying the loss landscape for ResNet-20, with and without residual connections. We trained each version of the model four separate times, each time using a unique random seed (e.g., 0, 123, 123456, and 2023). Here, we numerically verify the observations we made in Fig. \ref{fig:resnet_random} and provide additional insights based on persistence diagrams (not shown). We quantified the merge tree and persistence diagram by counting the number of saddle points and computing the average persistence, respectively. We compare our results with traditional machine learning metrics, including the Accuracy, Top-1 Hessian Eigenvalue, and Hessian Trace. Here, we show the relationship between those ML-based metrics and our two TDA-based metrics. These plots provide additional insights beyond the qualitative differences in the loss landscapes we observed in Fig. \ref{fig:resnet_random}, confirming that the landscapes for ResNet-20 models without residual connections correspond to merge trees with a higher number of saddle points (left column). In contrast, we see that these models (without residual connections) display a lower average persistence (right column). We observe an inverse relationship between the number of saddle points in the merge tree and the ML-based metrics, but a direct relationship between the average persistence and the same ML-based metrics. Together, these results provide insight into how changing the architecture of a neural network like ResNet-20 (i.e., by adding residual connections) can result in a "smoother" (and thereby easier to optimize) loss landscape.
  • ...and 1 more figure
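
The captions for Figures 3 to 5 describe evaluating the loss on a two-dimensional subspace and then summarizing it with a merge tree and persistence diagram. The sketch below illustrates that idea on a toy problem; it is not the authors' pipeline. The double-well loss, the axis-aligned directions, and the names `loss_grid` and `sublevel_persistence` are assumptions for illustration. Here, merge events in a union-find sweep over a 4-connected grid stand in for merge-tree saddle points, and only 0-dimensional sublevel-set persistence is computed.

```python
import numpy as np


def loss_grid(loss_fn, theta0, d1, d2, steps=31, span=1.5):
    """Evaluate loss_fn at theta0 + a*d1 + b*d2 on a steps x steps grid."""
    coords = np.linspace(-span, span, steps)
    grid = np.empty((steps, steps))
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            grid[i, j] = loss_fn(theta0 + a * d1 + b * d2)
    return coords, grid


def sublevel_persistence(grid):
    """0-dim sublevel-set persistence on a 4-connected grid (union-find sweep).

    Returns (birth, death) pairs, one per merge of two distinct basins.
    The global minimum never dies and is therefore not reported.
    """
    rows, cols = grid.shape
    order = np.argsort(grid, axis=None, kind="stable")  # cells by increasing loss
    parent = {}  # union-find forest over already-processed cells
    birth = {}   # component root -> loss value at which the component was born

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for flat in order:
        flat = int(flat)
        r, c = divmod(flat, cols)
        value = float(grid[r, c])
        parent[flat] = flat
        birth[flat] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            neighbor = nr * cols + nc
            if neighbor not in parent:
                continue  # neighbor has a higher loss and is not processed yet
            a, b = find(flat), find(neighbor)
            if a == b:
                continue
            # Elder rule: the component with the larger birth value dies here.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if value > birth[young]:
                pairs.append((birth[young], value))  # a basin merges at a saddle
            parent[young] = old
    return pairs


def toy_loss(theta):
    """Hypothetical double-well loss with minima at theta[0] = -1 and +1."""
    return (theta[0] ** 2 - 1.0) ** 2 + theta[1] ** 2 + theta[2] ** 2


theta0 = np.zeros(3)
d1 = np.array([1.0, 0.0, 0.0])  # assumed direction; random or Hessian-based in practice
d2 = np.array([0.0, 1.0, 0.0])
coords, grid = loss_grid(toy_loss, theta0, d1, d2)
pairs = sublevel_persistence(grid)
num_saddles = len(pairs)
avg_persistence = float(np.mean([d - b for b, d in pairs])) if pairs else 0.0
```

On this toy landscape the two wells merge at the barrier between them, so the sweep reports a single finite pair. For a trained network, `toy_loss` would be replaced by the model's loss evaluated at the interpolated parameters.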