Towards virtual painting recolouring using Vision Transformer on X-Ray Fluorescence datacubes

Alessandro Bombini, Fernando García-Avello Bofías, Francesca Giambi, Chiara Ruberto

TL;DR

This work tackles virtual recolouring of MA-XRF datacubes from cultural heritage artefacts. To cope with the scarcity of real data, it generates a synthetic spectral dataset and learns a compact spectral embedding. A Deep Variational Embedding produces a 3-dimensional latent representation of the spectra, which is used to build embedded MA-XRF images that feed a Vision Transformer-based recolouring network (SmallUViT). On synthetic data, the approach reaches an MS-SSIM of about 0.92, with a colour-aware loss used to promote perceptual colour fidelity, demonstrating the feasibility of mapping embedded spectral information to RGB images for conservation visualisation. The study lays the groundwork for domain adaptation to real MA-XRF data; planned extensions include unsupervised adaptation and Pix2Pix-like refinements to improve reconstruction quality and generalisation to actual artworks.

Abstract

In this contribution, we define and test a pipeline to perform virtual painting recolouring using raw data from X-Ray Fluorescence (XRF) analysis of pictorial artworks. To circumvent the small dataset size, we generate a synthetic dataset starting from a database of XRF spectra. Furthermore, to improve generalisation capacity (and to reduce in-memory size and inference time), we define a Deep Variational Embedding network that embeds the XRF spectra into a lower-dimensional, K-Means-friendly metric space. We then train a set of models to assign coloured images to embedded XRF images. We report the performance of the devised pipeline in terms of visual quality metrics, and we close with a discussion of the results.
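
As a rough illustration of the embedding step described in the abstract, the sketch below maps every per-pixel spectrum of a datacube to three values, turning an (H, W, channels) cube into an (H, W, 3) embedded image. Everything in it is an assumption made for illustration: the cube size, the channel count, and `toy_encoder`, which is a fixed random linear projection standing in for the trained Deep Variational Embedding network and is not the authors' model.

```python
import numpy as np


def embed_datacube(cube, encoder):
    """Apply a per-spectrum encoder to every pixel of an (H, W, C) datacube."""
    h, w, c = cube.shape
    spectra = cube.reshape(h * w, c)   # one row per pixel spectrum
    latent = encoder(spectra)          # (H*W, latent_dim)
    return latent.reshape(h, w, -1)    # back to image layout


rng = np.random.default_rng(0)
n_channels = 512                       # hypothetical number of energy bins
# Hypothetical synthetic cube of photon counts, 8 x 10 pixels.
cube = rng.poisson(lam=2.0, size=(8, 10, n_channels)).astype(np.float32)
# Fixed random projection to 3 dimensions (placeholder for a trained encoder).
projection = rng.normal(size=(n_channels, 3)).astype(np.float32)


def toy_encoder(spectra):
    """Stand-in encoder: normalise each spectrum by total counts, then project."""
    totals = spectra.sum(axis=1, keepdims=True)
    normalised = spectra / np.maximum(totals, 1.0)
    return normalised @ projection


embedded = embed_datacube(cube, toy_encoder)
print(embedded.shape)  # (8, 10, 3)
```

With a 3-dimensional latent space, the embedded image has the same layout as an RGB image, so it can be passed to an image-to-image model without storing the full spectrum at every pixel.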

Paper Structure

This paper contains 14 sections, 3 equations, 10 figures, 1 algorithm.

Figures (10)

  • Figure 1: Visual abstract of the pipeline devised in this paper. The flow is described in the main text and comprises four steps: synthetic data generation of the spectral signal (red region in the figure); a trained Deep Embedding model (blue region); a synthetic dataset of embedded MA-XRF images (yellow region); and, finally, a computer vision model to perform virtual recolouring.
  • Figure 2: Visual representation of the Deep Variational Embedding model
  • Figure 3: Example of input-real output pair of the embedded dataset.
  • Figure 4: Graphical representation of the SmallUViT model.
  • Figure 5: Training history plot.
  • ...and 5 more figures