Towards virtual painting recolouring using Vision Transformer on X-Ray Fluorescence datacubes
Alessandro Bombini, Fernando García-Avello Bofías, Francesca Giambi, Chiara Ruberto
TL;DR
This work tackles virtual recolouring of MA-XRF (macro X-ray fluorescence) datacubes from cultural heritage artefacts. Because real data are scarce, it generates a synthetic spectral dataset and learns a compact spectral embedding on which the downstream models can be trained. A Deep Variational Embedding produces a 3-dimensional latent representation of the spectra, which is used to build embedded MA-XRF images that feed a Vision Transformer-based recolouring network (SmallUViT). On synthetic data, the approach achieves an MS-SSIM of about 0.92, with perceptual colour fidelity encouraged by a colour-aware loss, demonstrating the feasibility of mapping embedded spectral information to RGB images for conservation visualisation. The study lays the groundwork for domain adaptation to real MA-XRF data; planned extensions include unsupervised adaptation and Pix2Pix-like refinements to improve reconstruction quality and generalisation to actual artworks.
Abstract
In this contribution, we define and test a pipeline to perform virtual painting recolouring using raw data from X-Ray Fluorescence (XRF) analysis of pictorial artworks. To circumvent the small dataset size, we generate a synthetic dataset starting from a database of XRF spectra. Furthermore, to ensure better generalisation capacity, and to reduce in-memory size and inference time, we define a Deep Variational Embedding network that embeds the XRF spectra into a lower-dimensional, K-Means-friendly metric space. We then train a set of models to assign coloured images to embedded XRF images. We report the performance of the devised pipeline in terms of visual quality metrics and close with a discussion of the results.
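
To make the embedding step concrete, the following is a minimal sketch of how a per-spectrum encoder could turn an XRF datacube into a 3-channel embedded image. It is not the authors' implementation: the function names `embed_datacube` and `make_placeholder_encoder`, the 512-channel toy cube, the unit-sum normalisation, and the random linear projection standing in for the trained Deep Variational Embedding encoder are all assumptions made for illustration.

```python
import numpy as np


def embed_datacube(cube, encoder):
    """Apply a per-spectrum encoder to every pixel of an XRF datacube.

    cube:    array of shape (H, W, C), one C-channel spectrum per pixel.
    encoder: callable mapping (N, C) spectra to (N, D) latent vectors.
    Returns an embedded image of shape (H, W, D).
    """
    height, width, channels = cube.shape
    spectra = cube.reshape(-1, channels).astype(np.float64)

    # Assumption: normalise each spectrum to unit total counts.
    # All-zero spectra are left as zeros to avoid dividing by zero.
    totals = spectra.sum(axis=1, keepdims=True)
    spectra = np.divide(
        spectra, totals, out=np.zeros_like(spectra), where=totals > 0
    )

    latent = encoder(spectra)
    return latent.reshape(height, width, -1)


def make_placeholder_encoder(channels, latent_dim=3, seed=0):
    """Hypothetical stand-in for a trained encoder: a fixed random projection."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(channels, latent_dim))
    return lambda spectra: spectra @ weights


# Toy datacube: 8 x 6 pixels, 512 energy channels of Poisson counts.
rng = np.random.default_rng(42)
toy_cube = rng.poisson(lam=5.0, size=(8, 6, 512))

encoder = make_placeholder_encoder(channels=512)
embedded = embed_datacube(toy_cube, encoder)
print(embedded.shape)  # (8, 6, 3)
```

In this sketch each pixel goes from 512 values to 3, which illustrates why an embedding of this kind reduces in-memory size and lets the recolouring network treat the embedded datacube like an ordinary 3-channel image. In the actual pipeline the placeholder projection would be replaced by the trained encoder.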
