Model editing for distribution shifts in uranium oxide morphological analysis

Davis Brown, Cody Nizinski, Madelyn Shapiro, Corey Fallon, Tianzhixi Yin, Henry Kvinge, Jonathan H. Tu

TL;DR

The paper tackles the challenge of distribution shifts in SEM-based morphological analysis of uranium oxide synthesis by applying model editing to adapt classifiers trained on base data to aging and detector shifts. It compares low-rank editing with surgical finetuning and full finetuning, showing that targeted, low-rank edits generally outperform full or broad finetuning, particularly for aging-induced feature changes, while detector shifts remain harder to address. The findings suggest model editing as a practical, data-efficient approach to incorporate aging-study data and instrument variations with minimal impact on original performance, with future work exploring multi-detector exemplar mixtures and generative domain adaptation to broaden shift coverage.

Abstract

Deep learning still struggles with certain kinds of scientific data. Notably, pretraining data may not provide coverage of relevant distribution shifts (e.g., shifts induced by the use of different measurement instruments). We consider deep learning models trained to classify the synthesis conditions of uranium ore concentrates (UOCs) and show that model editing is particularly effective for improving generalization to distribution shifts common in this domain. In particular, model editing outperforms finetuning on two curated datasets: one of micrographs of U$_{3}$O$_{8}$ aged in humidity chambers, and one of micrographs acquired with different scanning electron microscopes.

Paper Structure (10 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Comparison of how well models finetuned on an aging dataset $D$ generalize to other ages. Both surgical finetuning (a) and low-rank editing (b) generalize to earlier ages, and all edited models outperform the baseline accuracy in (c). Because we drop runs whose accuracy on the baseline validation set falls by $>1.5\%$, there are no successful full-model finetuning runs. See Figure 2 for a more permissive threshold.
  • Figure 2: Comparison of how well models updated for an aging dataset $D$ generalize to other ages, allowing up to a $7\%$ drop in accuracy on the original validation set. Even in this more aggressive editing setting (as opposed to Figure 1, which used a threshold of $1.5\%$), surgical finetuning and low-rank editing tend to outperform full finetuning. The aging accuracies for the unedited model are given in Figure 1(c).
  • Figure 3: Model editing performance on the T2 SE detector dataset.
  • Figure 4: Images from aging datasets. Note the (visually apparent) differences in the 60 day aged samples from the others.
  • Figure 5: SEM detector comparison images from Nova NanoSEM and Teneo models.
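
The captions of Figures 1 and 2 describe a filtering rule: an updated model is kept only if its accuracy on the original validation set drops by no more than a threshold ($1.5\%$ in Figure 1, $7\%$ in Figure 2). The following is a minimal sketch of that rule, not the authors' code. It assumes the drop is measured in absolute percentage points, and the `EditRun` class, the `keep_runs` function, and all accuracy values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EditRun:
    """One model-update run (all fields are illustrative, not from the paper)."""
    method: str            # e.g. "low-rank", "surgical", "full"
    base_val_acc: float    # accuracy (%) on the original validation set after the update
    shifted_acc: float     # accuracy (%) on the shifted (e.g. aged) data after the update


def keep_runs(runs, baseline_acc, max_drop):
    """Keep runs whose accuracy on the original validation set drops by at most max_drop.

    Assumption: the drop is in absolute percentage points relative to the
    unedited model's accuracy, baseline_acc.
    """
    return [r for r in runs if baseline_acc - r.base_val_acc <= max_drop]


# Hypothetical numbers, used only to show how the two thresholds behave.
baseline_acc = 90.0
runs = [
    EditRun("low-rank", base_val_acc=89.0, shifted_acc=70.0),   # drop of 1.0
    EditRun("surgical", base_val_acc=85.0, shifted_acc=72.0),   # drop of 5.0
    EditRun("full", base_val_acc=80.0, shifted_acc=75.0),       # drop of 10.0
]

strict = keep_runs(runs, baseline_acc, max_drop=1.5)       # threshold used in Figure 1
permissive = keep_runs(runs, baseline_acc, max_drop=7.0)   # threshold used in Figure 2

print([r.method for r in strict])
print([r.method for r in permissive])
```

Under a strict threshold, a method can have no surviving runs at all, which is how Figure 1 ends up with no full-model finetuning results; relaxing the threshold admits more runs at the cost of larger degradation on the original data.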