StyleX: A Trainable Metric for X-ray Style Distances
Dominik Eckert, Christopher Syben, Christian Hümmer, Ludwig Ritschl, Steffen Kappler, Sebastian Stober
TL;DR
StyleX addresses the lack of a quantitative style-distance metric for X-ray images by learning style embeddings with Simple Siamese training on identically styled image pairs. The style distance between two images is derived from the cosine similarity of their embeddings, which are produced by a ResNet-18 encoder. The resulting style representations are robust and content-invariant, and training requires no explicit style-distance labels. Experiments on MBTST mammograms processed with the LAP and PASS pipelines show meaningful style clustering, and StyleX distances align with perceptual style differences for both matching and non-matching content, including previously unseen styles. The metric enables automatic style selection and could serve as a style loss for optimizing imaging pipelines across X-ray modalities.
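To make the cosine-based distance concrete, the following is a minimal sketch in plain Python. It assumes the distance is taken as one minus the cosine similarity of two embedding vectors; that exact form, the function names, and the toy vectors are illustrative assumptions, not the authors' implementation. In practice the embeddings would come from the trained encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def style_distance(u, v):
    """Hypothetical style distance: 1 - cosine similarity (an assumption).

    Returns 0 for embeddings pointing in the same direction and grows
    as the embeddings diverge, up to 2 for opposite directions.
    """
    return 1.0 - cosine_similarity(u, v)


# Toy embeddings standing in for encoder outputs (made-up values).
same_style = style_distance([1.0, 0.0], [2.0, 0.0])
different_style = style_distance([1.0, 0.0], [0.0, 1.0])
```

Because only the angle between embeddings matters, scaling an embedding does not change the distance, which is why `same_style` is zero even though the two toy vectors differ in length.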
Abstract
The progression of X-ray technology introduces diverse image styles that need to be adapted to the preferences of radiologists. To support this task, we introduce a novel deep learning-based metric that quantifies style differences of non-matching image pairs. At the heart of our metric is an encoder capable of generating X-ray image style representations. This encoder is trained without any explicit knowledge of style distances by exploiting Simple Siamese learning. During inference, the style representations produced by the encoder are used to calculate a distance metric for non-matching image pairs. Our experiments investigate the proposed concept for two image processing pipelines, one publicly disclosed and reproducible and one proprietary, along two dimensions: First, we use a t-distributed stochastic neighbor embedding (t-SNE) analysis to illustrate that the encoder outputs provide meaningful and discriminative style representations. Second, the proposed metric calculated from the encoder outputs is shown to quantify style distances for non-matching pairs in good alignment with human perception. These results confirm that our proposed method is a promising technique to quantify style differences, which can be used for guided style selection as well as automatic optimization of image pipeline parameters.
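For orientation, the generic Simple Siamese (SimSiam) objective, as commonly stated in the literature, can be written as follows. This is a sketch of the standard formulation and not necessarily the exact loss used in this work. The symbols are assumptions introduced here: $x_1, x_2$ are two images sharing the same style, $f$ is the encoder, $h$ is a prediction head, and $\operatorname{sg}(\cdot)$ denotes the stop-gradient operation.

```latex
z_i = f(x_i), \qquad p_i = h(z_i), \qquad i \in \{1, 2\}

\mathcal{D}(p, z) = -\,\frac{p}{\lVert p \rVert_2} \cdot \frac{z}{\lVert z \rVert_2}

\mathcal{L} = \tfrac{1}{2}\,\mathcal{D}\bigl(p_1, \operatorname{sg}(z_2)\bigr)
            + \tfrac{1}{2}\,\mathcal{D}\bigl(p_2, \operatorname{sg}(z_1)\bigr)
```

Minimizing $\mathcal{L}$ pulls the representations of the two inputs together using only positive pairs, which is consistent with training the encoder without explicit style-distance labels.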
