UserBoost: Generating User-specific Synthetic Data for Faster Enrolment into Behavioural Biometric Systems

George Webber, Jack Sturgess, Ivan Martinovic

TL;DR

This work tackles the enrolment burden in behavioural biometric smartwatch authentication by generating user-specific synthetic IMU gestures using a regularised autoencoder, enabling effective training of a lightweight classifier. The authors design a VAE-based generative model with explicit latent-space regularisation and a user-clustering objective, and they explore multiple latent-space sampling strategies to produce diverse, high-fidelity synthetic gestures. Evaluation on the WatchAuth dataset shows that synthetic data can reduce the number of real gestures required for enrolment by more than 40% while maintaining or improving metrics such as FAR@0 and AUROC, though gains vary by user. The approach offers a practical path to faster, privacy-preserving enrolment in resource-constrained wearable authentication and suggests avenues for broader application and integration with other generative methods.

Abstract

Behavioural biometric authentication systems entail an enrolment period that is burdensome for the user. In this work, we explore generating synthetic gestures from a few real user gestures with generative deep learning, with the application of training a simple (i.e. non-deep-learned) authentication model. Specifically, we show that utilising synthetic data alongside real data can reduce the number of real datapoints a user must provide to enrol into a biometric system. To validate our methods, we use the publicly available dataset of WatchAuth, a system proposed in 2022 for authenticating smartwatch payments using the physical gesture of reaching towards a payment terminal. We develop a regularised autoencoder model for generating synthetic user-specific wrist motion data representing these physical gestures, and demonstrate the diversity and fidelity of our synthetic gestures. We show that using synthetic gestures in training can improve classification ability for a real-world system. Through this technique we can reduce the number of gestures required to enrol a user into a WatchAuth-like system by more than 40% without negatively impacting its error rates.
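
The error rates referred to in the abstract are, in biometric systems, usually the false acceptance rate (FAR) and false rejection rate (FRR) measured at a decision threshold. The sketch below illustrates how these are commonly computed from classifier scores. It is not the authors' evaluation code: the function names and the convention that a score at or above the threshold means "accept" are assumptions made for illustration.

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR), assuming samples scoring >= threshold are accepted."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    far = float(np.mean(impostor >= threshold))  # impostors wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine user wrongly rejected
    return far, frr

def approximate_eer(genuine_scores, impostor_scores, thresholds):
    """Return (threshold, FAR, FRR) at the candidate threshold where
    |FAR - FRR| is smallest. With few candidate thresholds this only
    brackets the equal error rate rather than pinning it down."""
    best = None
    for t in thresholds:
        far, frr = far_frr(genuine_scores, impostor_scores, t)
        if best is None or abs(far - frr) < abs(best[1] - best[2]):
            best = (t, far, frr)
    return best
```

Because a classifier with a small number of distinct output scores yields only a few usable thresholds, the point where FAR equals FRR may fall between two of them, so an estimate of this kind is best read as a range.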

Paper Structure

This paper contains 35 sections, 7 equations, 15 figures, 3 tables.

Figures (15)

  • Figure 1: Left: A smartwatch user making a payment gesture. Right: The positioning of six contactless payment terminals used in WatchAuth experiments; the seventh terminal (centre) was handheld (image courtesy of Sturgess et al., 2022).
  • Figure 2: The dissimilarity of two WatchAuth gestures, measured with (a) Mean Square Error (MSE) distance, (b) Dynamic Time Warping (DTW) distance, and (c) Keogh's lower bound. Matching between points is shown in grey.
  • Figure 3: A plot of FAR and FRR against decision threshold $T$ from an RF100 classifier (defined in Section \ref{sec:RF100}). Marked points show thresholds. Observe that the true EER may lie between $\sim30\%$ and $\sim45\%$.
  • Figure 4: The first two principal components (PCs) of the training data's latent space embedding for an unregularised autoencoder trained using the KLB-mod + Feature loss.
  • Figure 5: Training data latent space mean embeddings for VAEs trained with decreasing $\beta$ values, decreasing the influence of regularisation. Different colours represent different users.
  • ...and 10 more figures
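
The three dissimilarity measures named in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it treats each gesture as a single equal-length 1-D series (real WatchAuth gestures are multi-axis IMU recordings), uses a squared pointwise cost, and the function names and the `window` parameter are hypothetical.

```python
import numpy as np

def mse_distance(a, b):
    """Mean squared error between two equal-length series (point i matched to point i)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def dtw_distance(a, b, window=None):
    """Dynamic Time Warping distance via dynamic programming.
    `window` optionally limits how far apart matched indices may be."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

def lb_keogh(a, b, window):
    """Keogh's lower bound on the windowed DTW distance between
    equal-length series, using the upper and lower envelope of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = 0.0
    for i, x in enumerate(a):
        segment = b[max(0, i - window): i + window + 1]
        upper, lower = segment.max(), segment.min()
        if x > upper:
            total += (x - upper) ** 2
        elif x < lower:
            total += (x - lower) ** 2
    return float(np.sqrt(total))

# Two identical bumps, one shifted by a single time step.
a = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
print(mse_distance(a, b), dtw_distance(a, b), lb_keogh(a, b, window=1))
```

In this toy case the pointwise MSE penalises the time shift, whereas DTW can warp one series onto the other; the lower bound is cheaper to compute than DTW and never exceeds the DTW distance for the same window.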