Classification of Firn Data via Topological Features

Sarah Day, Jesse Dimino, Matt Jester, Kaitlin Keegan, Thomas Weighill

TL;DR

The paper addresses depth prediction from firn micro-CT images using invariant topological features. It compares sublevel-set and distance-transform persistent-homology featurizations, each summarized as Betti and Gaussian persistence curves and fed into random forests. Results reveal trade-offs: sublevel-set features achieve high accuracy on unblurred data but are sensitive to noise and out-of-sample depths, while distance-transform features are more robust to blur and better at extrapolating to unseen depths. The findings highlight the need to balance scale sensitivity and preprocessing choices in TDA-based texture analysis, with implications for geoscience applications in firn densification studies and cross-site depth inference.

Abstract

In this paper we evaluate the performance of topological features for generalizable and robust classification of firn image data, with the broader goal of understanding the advantages, pitfalls, and trade-offs in topological featurization. Firn refers to layers of granular snow within glaciers that haven't been compressed into ice. This compactification process imposes distinct topological and geometric structure on firn that varies with depth within the firn column, making topological data analysis (TDA) a natural choice for understanding the connection between depth and structure. We use two classes of topological features, sublevel set features and distance transform features, together with persistence curves, to predict sample depth from microCT images. A range of challenging training-test scenarios reveals that no one choice of method dominates in all categories, and uncovers a web of trade-offs between accuracy, interpretability, and generalizability.

Paper Structure

This paper contains 14 sections, 8 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1.1: One sample image per depth for the firn dataset. Gray areas are ice, while the black areas show the pore (air) space.
  • Figure 2.1: Computing the $0^{\text{th}}$ Betti curve (right) from the $0^{\text{th}}$-dimensional persistence diagram (left). The red dashed lines indicate the extent of the fundamental box at each threshold $t$.
  • Figure 3.1: Visual representations of the two topological featurizations for an example image, using Betti curves as an example.
  • Figure 3.2: The original image and two manipulations used to evaluate predictor generalizability. The original image is shown in (a). In (b), we show the quadrants used for the Split and Split BR scenarios. The blurred version of the image (applied to the test set in Blurred) is shown in (c).
  • Figure 4.1: On the left, the average Betti curve vector $\mathbf{v}_{\mathrm{Betti}}^{SS}$ by depth for the sublevel set featurization, showing good separation but conspicuously high peak values. On the right, we show a sample image from the $78$ m depth thresholded at $t = 120$, near the peak of the average curve for that depth. We see that the topology of the image at this threshold is dominated by granular noise.
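
To make the thresholding in the Figure 4.1 caption concrete, here is a minimal sketch of what the $0^{\text{th}}$ Betti number of a sublevel set measures at a single threshold: the number of connected components among pixels with intensity at most $t$. The function name `count_components`, the toy image, and the use of 4-connectivity are assumptions made for illustration. The paper's actual pipeline is not reproduced here, and persistent-homology software tracks components across all thresholds at once rather than recounting them one threshold at a time.

```python
from collections import deque


def count_components(image, t):
    """Count 4-connected components of the sublevel set {pixel <= t}.

    `image` is a list of equal-length rows of grayscale intensities.
    4-connectivity is an illustrative assumption, not the paper's stated choice.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            # Skip pixels outside the sublevel set or already assigned.
            if image[r][c] > t or seen[r][c]:
                continue
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            # Breadth-first flood fill over the component containing (r, c).
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] <= t):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return components


# Hypothetical 3x3 image: four dark corner pixels separated by bright pixels.
toy = [
    [10, 200, 30],
    [200, 200, 200],
    [40, 200, 130],
]

for t in (5, 120, 150, 255):
    print(t, count_components(toy, t))
```

In this toy image every isolated dark pixel counts as its own component, which is one way small-scale granular noise can inflate the component count at intermediate thresholds.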

Theorems & Definitions (2)

  • Definition 2.1: Betti curve
  • Definition 2.2: Gaussian persistence curve
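
As an illustration of Definition 2.1, the sketch below evaluates a Betti curve from a persistence diagram by counting, at each threshold $t$, the points $(b, d)$ that lie in the fundamental box, i.e. the features that are born by $t$ and have not yet died. The half-open convention $b \le t < d$, the function name `betti_curve`, and the example diagram are assumptions for illustration; the paper's own definition is not reproduced on this page and may use a different boundary convention.

```python
def betti_curve(diagram, thresholds):
    """Return the Betti curve of `diagram` sampled at `thresholds`.

    `diagram` is a list of (birth, death) pairs for one homology dimension.
    A feature is counted at threshold t when birth <= t < death
    (an assumed convention; boundary handling varies between sources).
    """
    return [sum(1 for b, d in diagram if b <= t < d) for t in thresholds]


# Hypothetical diagram with three features of decreasing persistence.
diagram = [(0.0, 4.0), (1.0, 3.0), (2.0, 2.5)]
thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]

curve = betti_curve(diagram, thresholds)
print(curve)
```

Sampling the curve on a fixed grid of thresholds gives a fixed-length vector per image, which is the kind of input a classifier such as a random forest can consume.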