Contrastive Learning for Continuous Touch-Based Authentication

Mengyu Qiao, Yunpeng Zhai, Yang Wang

TL;DR

This work proposes a unified contrastive learning framework for continuous authentication in a non-disruptive manner, which leverages a Temporal Masked Autoencoder to extract temporal patterns from raw multi-sensor data streams, capturing continuous motion and gesture dynamics.

Abstract

Smart mobile devices have become indispensable in modern daily life, where sensitive information is frequently processed, stored, and transmitted, posing critical demands for robust security controls. Given that touchscreens are the primary medium for human-device interaction, continuous user authentication based on touch behavior presents a natural and seamless security solution. While existing methods predominantly adopt binary classification under single-modal learning settings, we propose a unified contrastive learning framework for continuous authentication in a non-disruptive manner. Specifically, the proposed method leverages a Temporal Masked Autoencoder (TMAE) to extract temporal patterns from raw multi-sensor data streams, capturing continuous motion and gesture dynamics. The pre-trained TMAE is subsequently integrated into a Siamese Temporal-Attentive Convolutional Network within a contrastive learning paradigm to model both sequential and cross-modal patterns. To further enhance performance, we incorporate multi-head attention and channel attention mechanisms to capture long-range dependencies and optimize inter-channel feature integration. Extensive experiments on public benchmarks and a self-collected dataset demonstrate that our approach outperforms state-of-the-art methods, offering a reliable and effective solution for user authentication on mobile devices.
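
To make the Siamese contrastive setup more concrete, the sketch below shows how a pair of session embeddings could be scored during training and at authentication time. The abstract does not state the exact loss, so a common margin-based pairwise contrastive loss is assumed here. The function names, the `margin` and `threshold` values, and the use of Euclidean distance are all hypothetical choices made for illustration, not details taken from the paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_pair_loss(emb_a, emb_b, same_user, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form, not from the paper).

    Genuine pairs (same_user=True) are penalized by their squared distance.
    Impostor pairs are penalized only when they are closer than `margin`.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_user:
        return d ** 2
    return max(0.0, margin - d) ** 2


def accept(emb_enrolled, emb_probe, threshold=0.5):
    """Accept the probe if it lies within `threshold` of the enrolled template.

    `threshold` is a hypothetical operating point; in practice it would be
    tuned on validation data.
    """
    return euclidean_distance(emb_enrolled, emb_probe) <= threshold


if __name__ == "__main__":
    # Toy embeddings standing in for the outputs of the two Siamese branches.
    print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_user=True))
    print(contrastive_pair_loss([0.0], [0.5], same_user=False))
    print(accept([0.0], [0.25]))
```

In a full system, both embeddings would come from the same shared-weight encoder applied to two touch sessions. The loss pulls sessions from the same user together and pushes sessions from different users at least `margin` apart, which is what allows a simple distance threshold to serve as the continuous authentication decision.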

Paper Structure

This paper contains 36 sections, 26 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: TMAE Model Architecture
  • Figure 2: Comparison of touch dynamics for different gestures across users. (a), (b), (c), and (d) show the pressure and touch area of two instances of the same user. (e) and (f) represent the sequences of a different user.
  • Figure 3: TouchSeqNet
  • Figure 4: The structure of the FingerCA module