LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics

Tiankai Xie, Jiaqing Chen, Yaoqing Yang, Caleb Geniesse, Ge Shi, Ajinkya Chaudhari, John Kevin Cava, Michael W. Mahoney, Talita Perciano, Gunther H. Weber, Ross Maciejewski

TL;DR

LossLens addresses the challenge of interpreting high-dimensional loss landscapes by introducing a multi-scale visual analytics framework that integrates global metrics (e.g., mode connectivity and CKA similarity) with local curvature and topology (top Hessian eigenvalues, persistence diagrams, and merge trees). It extends Yang et al.'s taxonomy to provide a cohesive representation linking model-level and landscape-level information. The authors present two case studies (architecture alteration in ResNet-20 and loss-function alteration in PINNs) to show how architectural choices and physical parameters reshape both global connectivity and local minima. Expert interviews validate the framework's usefulness while highlighting scalability and usability considerations, guiding future improvements such as higher-dimensional projections and broader model support. Overall, LossLens offers a scalable, interpretable workflow for diagnosing and understanding deep learning models through multi-scale loss-landscape visualization.

Abstract

Modern machine learning often relies on optimizing a neural network's parameters using a loss function to learn complex features. Beyond training, examining the loss function with respect to a network's parameters (i.e., as a loss landscape) can reveal insights into the architecture and learning process. While the local structure of the loss landscape surrounding an individual solution can be characterized using a variety of approaches, the global structure of a loss landscape, which includes potentially many local minima corresponding to different solutions, remains far more difficult to conceptualize and visualize. To address this difficulty, we introduce LossLens, a visual analytics framework that explores loss landscapes at multiple scales. LossLens integrates metrics from global and local scales into a comprehensive visual representation, enhancing model diagnostics. We demonstrate LossLens through two case studies: visualizing how residual connections influence a ResNet-20, and visualizing how physical parameters influence a physics-informed neural network (PINN) solving a simple convection problem.
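To make the notion of a loss landscape concrete, the following is a minimal sketch of evaluating a loss on a two-dimensional slice of parameter space around a trained solution. It is not the LossLens implementation: the helper `loss_surface`, the toy linear-regression model, and the choice of two random unit directions are all assumptions made for illustration.

```python
import numpy as np

def loss_surface(loss_fn, theta, d1, d2, radius=1.0, steps=21):
    """Evaluate loss_fn on the 2D slice theta + a*d1 + b*d2 (hypothetical helper)."""
    coords = np.linspace(-radius, radius, steps)
    grid = np.empty((steps, steps))
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            grid[i, j] = loss_fn(theta + a * d1 + b * d2)
    return coords, grid

# Toy stand-in for a trained network: noiseless linear regression.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def mse(w):
    return float(np.mean((X @ w - y) ** 2))

# "Trained" parameters: the least-squares solution.
theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)

# Two random unit directions spanning the slice (an assumption of this sketch).
d1 = rng.normal(size=5)
d2 = rng.normal(size=5)
d1 = d1 / np.linalg.norm(d1)
d2 = d2 / np.linalg.norm(d2)

coords, grid = loss_surface(mse, theta_star, d1, d2)
```

The resulting `grid` can be passed to a contour plot. For this convex toy problem the slice has a single basin centered on `theta_star`; a real network would typically show a more irregular surface.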

Paper Structure

This paper contains 21 sections, 6 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Caricature of different types of loss landscapes. Loss landscapes can be classified into five types (Yang et al., 2021): globally well-connected versus globally poorly-connected loss landscapes; and locally sharp versus locally flat loss landscapes, where locally flat and globally well-connected loss landscapes can be further distinguished based on the similarity between models. Globally poorly-connected loss landscapes have high barriers between minimum-loss areas, and globally well-connected loss landscapes tend to have low-energy (low-loss) paths between models. Locally flat loss landscapes are "smoother" than locally sharp landscapes. The metrics used for categorizing the types are ① Hessian; ② Mode Connectivity; ③ CKA Similarity; and ④ Topological Data Analysis. Here, local sharpness is measured by the Merge Tree (left) and the Persistence Diagram (right). A merge tree with multiple branches and a rough persistence diagram indicate a "bad" landscape, and vice versa.
  • Figure 2: Our framework consists of two stages: (A) Global Analysis, and (B) Local Analysis. In Stage A, analysts evaluate a model's global structure by computing a combination of metrics for trained models and exploring the visualized global structure for models of interest. Next, in Stage B, analysts investigate the detailed local features of a selected model's loss landscape. Because visualized loss contours can be hard to evaluate, persistence diagram and merge tree visualizations are provided to help analysts better analyze the local structure.
  • Figure 3: Visual encodings in the global structure view. Each node in the global structure view has three layers: The outer ring visualizes the performance of the model, e.g., accuracy, recall, precision, and F1; the middle ring visualizes the top Hessian eigenvalues of the model; and the color of the inner circle represents one unique model so that analysts can distinguish between different models of interest. Edge properties represent the mode connectivity between models: a thick and straight line implies good mode connectivity and a thin and curved line implies poor mode connectivity. The position of each model in the global structure view is visualized by projection methods along with the CKA similarity.
  • Figure 4: Global Structure Design: Versions A and C are preferred by experts.
  • Figure 5: LossLens compares ResNet-20 with and without residual connections, ① revealing that despite similar accuracy (90-92%), ResNet-20 with residuals (ResNet-20-R) exhibits better mode connectivity, ② a flatter loss landscape, and ③ ④ more branches in the merge tree compared to ResNet-20 without residuals (ResNet-20-NR). This highlights how residual connections improve the ResNet architecture beyond performance.
  • ...and 3 more figures
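
The captions above refer to CKA similarity between models. As a rough illustration of what such a score measures, here is a minimal sketch of the standard linear CKA formula applied to two activation matrices of shape (examples, features). The function name `linear_cka` and the synthetic activations are assumptions of this sketch; the source does not state which CKA variant or which layers LossLens uses.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between activation matrices X (n, p) and Y (n, q)."""
    # Center each feature column so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))

# Synthetic activations standing in for two models' representations.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(256, 32))

# A rotated copy of acts_a: linear CKA is invariant to orthogonal transforms.
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
acts_b = acts_a @ Q

# Independent random activations: expected to score much lower.
acts_c = rng.normal(size=(256, 32))

cka_same = linear_cka(acts_a, acts_b)
cka_diff = linear_cka(acts_a, acts_c)
```

Here `cka_same` is 1 up to floating-point error, because the second representation is only a rotation of the first, while `cka_diff` is much smaller because the two representations are unrelated.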