SAGE: Saliency-Guided Contrastive Embeddings

Colton R. Crum, Christopher Sweet, Adam Czajka

TL;DR

SAGE introduces Saliency-Guided Contrastive Embeddings, a training framework that shifts human saliency guidance from image space into the model's embedding space. It uses saliency-based augmentations, logit alignment with Jensen-Shannon divergence and temperature scaling, and a contrastive triplet loss on embeddings to steer the network toward salient features while avoiding non-salient ones. The method achieves state-of-the-art open-set performance across Iris POD, Chest X-ray anomalies, and synthetic-face detection, demonstrating robust generalization across backbones including CNNs and vision transformers. Sanity checks with inverted saliency maps confirm that performance gains stem from genuine saliency guidance rather than mere regularization, and ablations identify optimal temperature and augmentation settings. Overall, SAGE offers a scalable, architecture-agnostic approach to integrate human perceptual priors into deep learning for high-risk tasks.

Abstract

Integrating human perceptual priors into the training of neural networks has been shown to improve model generalization, serve as an effective regularizer, and align models with human expertise for applications in high-risk domains. Existing approaches to integrating saliency into model training often rely on internal model mechanisms, which recent research suggests may be unreliable. Our insight is that many challenges associated with saliency-guided training stem from placing the guidance solely within the image space. Instead, we move away from the image space and use the model's latent-space embeddings to steer human guidance during training. We propose SAGE (Saliency-Guided Contrastive Embeddings): a loss function that integrates human saliency into network training using contrastive embeddings. We apply saliency-preserving and saliency-degrading signal augmentations to the input and capture the resulting changes in embeddings and model logits. We guide the model towards salient features and away from non-salient features using a contrastive triplet loss. Additionally, we perform a sanity check on the logit distributions to ensure that the model outputs match the saliency-based augmentations. We demonstrate a boost in classification performance across both open- and closed-set scenarios against SOTA saliency-based methods, showing SAGE's effectiveness across various backbones, and include experiments that suggest it generalizes widely across tasks.
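The two loss ingredients named in the TL;DR and abstract (logit alignment via Jensen-Shannon divergence with temperature scaling, and a contrastive triplet loss on embeddings) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the choice of aligning the logits of $x$ with those of $\tilde{x}$, the triplet roles (anchor from $x$, positive from $\tilde{x}$, negative from $\tilde{x}'$), the Euclidean distance, and the `temperature`, `margin`, and `weight` defaults are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) along the last axis."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)  # mixture distribution
    kl_pm = np.sum(p * np.log(p / m), axis=-1)
    kl_qm = np.sum(q * np.log(q / m), axis=-1)
    return 0.5 * (kl_pm + kl_qm)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss with Euclidean distances (distance choice is an assumption)."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

def sage_style_loss(logits_x, logits_keep, z_x, z_keep, z_degrade,
                    temperature=2.0, margin=1.0, weight=1.0):
    """Hypothetical combination of the two terms; pairing and weighting are assumed.

    logits_x / z_x:      outputs for the original input x
    logits_keep / z_keep: outputs for the saliency-preserving variant
    z_degrade:           embedding for the saliency-degrading variant
    """
    p_x = softmax(logits_x, temperature)
    p_keep = softmax(logits_keep, temperature)
    align = js_divergence(p_x, p_keep)                       # logit alignment term
    contrast = triplet_loss(z_x, z_keep, z_degrade, margin)  # embedding term
    return float(np.mean(align + weight * contrast))
```

In this sketch a higher `temperature` flattens both distributions before they are compared, so the alignment term reacts less to small logit differences. The triplet term is zero once the saliency-degraded embedding is at least `margin` farther from the anchor than the saliency-preserved one.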

Paper Structure

This paper contains 26 sections, 8 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: An overview of the two main components of SAGE. Saliency-based Augmentations (left): Input $x$ is blurred to generate $x'$, a degraded image with high-frequency information removed. Using salient features indicated by human experts, we apply selective blurring to generate $\tilde{x}$, which preserves salient features, and $\tilde{x}'$, which degrades salient features. During training, the model makes a forward pass for each version of $x$, producing output logits $y$ and embeddings $z$ that are captured by a predefined hook. Our loss encourages an alignment of logit distributions (middle column) and guides the respective embeddings using a contrastive triplet loss (right).
  • Figure 2: Examples and notation of the saliency-based augmentations described in this paper. Left side (left to right): input sample $x_i$, human saliency map (indicating salient regions) $x_h$, selective blurring that preserves the salient regions $\tilde{x}_i$. Right side (left to right): blurring on the entire image $x'_i$, the inverse of the saliency map (indicating non-salient regions) $x'_h$, selective blurring that degrades the salient regions $\tilde{x}'_i$.
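
The augmentations in the figure captions above can be sketched in NumPy as follows. This is an illustrative reading, not the authors' code. The box filter, the `radius` parameter, the per-pixel blending rule, and the function names are all assumptions, since the captions do not specify the blur kernel or how the blurred and sharp images are composited.

```python
import numpy as np

def box_blur(image, radius=2):
    """Mean filter over a (2*radius+1)^2 window with edge padding (2-D grayscale)."""
    image = np.asarray(image, dtype=np.float64)
    k = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros_like(image)
    h, w = image.shape
    # Sum every shifted copy of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def saliency_augmentations(x, x_h, radius=2):
    """Return (x', x~, x~') for input x and saliency map x_h with values in [0, 1].

    x'  : fully blurred image
    x~  : salient regions kept sharp, everything else blurred
    x~' : salient regions blurred, everything else kept sharp
    The linear blend below is an assumed compositing rule.
    """
    x = np.asarray(x, dtype=np.float64)
    x_h = np.clip(np.asarray(x_h, dtype=np.float64), 0.0, 1.0)
    x_blur = box_blur(x, radius)             # x'
    x_h_inv = 1.0 - x_h                      # x'_h, the inverted saliency map
    x_keep = x_h * x + x_h_inv * x_blur      # x~
    x_degrade = x_h_inv * x + x_h * x_blur   # x~'
    return x_blur, x_keep, x_degrade
```

With a binary map, each output pixel comes entirely from either the sharp or the blurred image. A soft map in $[0, 1]$ would mix the two in proportion to the saliency value.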