Table of Contents
Fetching ...

Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning

Alain Komaty, Hatef Otroshi Shahreza, Anjith George, Sebastien Marcel

TL;DR

This paper investigates the potential of GPT-4o for Face Presentation Attack Detection (PAD) in zero- and few-shot in-context learning. It demonstrates that few-shot prompts with reference images substantially boost performance and that detailed prompts yield reliable scoring, while explainability prompts provide modest gains; notably, GPT-4o can infer attack type (print vs replay) without explicit instruction. However, zero-shot performance is limited compared to specialized PAD systems, and cross-dataset generalization remains a challenge. The study uses a consented SOTERIA subset and compares against a pretrained DPB model and COTS solutions, laying groundwork for privacy-aware, data-efficient PAD research and future cross-dataset analyses.

Abstract

This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. Our results show that GPT-4o demonstrates high consistency, particularly in few-shot in-context learning, where its performance improves as more examples are provided (reference data). We also observe that detailed prompts enable the model to provide scores reliably, a behavior not observed with concise prompts. Additionally, explanation-seeking prompts slightly enhance the model's performance by improving its interpretability. Remarkably, the model exhibits emergent reasoning capabilities, correctly predicting the attack type (print or replay) with high accuracy in few-shot scenarios, despite not being explicitly instructed to classify attack types. Despite these strengths, GPT-4o faces challenges in zero-shot tasks, where its performance is limited compared to specialized PAD systems. Experiments were conducted on a subset of the SOTERIA dataset, ensuring compliance with data privacy regulations by using only data from consenting individuals. These findings underscore GPT-4o's promise in PAD applications, laying the groundwork for future research to address broader data privacy concerns and improve cross-dataset generalization. Code available here: https://gitlab.idiap.ch/bob/bob.paper.wacv2025_chatgpt_face_pad

Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning

TL;DR

This paper investigates the potential of GPT-4o for Face Presentation Attack Detection (PAD) in zero- and few-shot in-context learning. It demonstrates that few-shot prompts with reference images substantially boost performance and that detailed prompts yield reliable scoring, while explainability prompts provide modest gains; notably, GPT-4o can infer attack type (print vs replay) without explicit instruction. However, zero-shot performance is limited compared to specialized PAD systems, and cross-dataset generalization remains a challenge. The study uses a consented SOTERIA subset and compares against a pretrained DPB model and COTS solutions, laying groundwork for privacy-aware, data-efficient PAD research and future cross-dataset analyses.

Abstract

This study highlights the potential of ChatGPT (specifically GPT-4o) as a competitive alternative for Face Presentation Attack Detection (PAD), outperforming several PAD models, including commercial solutions, in specific scenarios. Our results show that GPT-4o demonstrates high consistency, particularly in few-shot in-context learning, where its performance improves as more examples are provided (reference data). We also observe that detailed prompts enable the model to provide scores reliably, a behavior not observed with concise prompts. Additionally, explanation-seeking prompts slightly enhance the model's performance by improving its interpretability. Remarkably, the model exhibits emergent reasoning capabilities, correctly predicting the attack type (print or replay) with high accuracy in few-shot scenarios, despite not being explicitly instructed to classify attack types. Despite these strengths, GPT-4o faces challenges in zero-shot tasks, where its performance is limited compared to specialized PAD systems. Experiments were conducted on a subset of the SOTERIA dataset, ensuring compliance with data privacy regulations by using only data from consenting individuals. These findings underscore GPT-4o's promise in PAD applications, laying the groundwork for future research to address broader data privacy concerns and improve cross-dataset generalization. Code available here: https://gitlab.idiap.ch/bob/bob.paper.wacv2025_chatgpt_face_pad
Paper Structure (14 sections, 2 figures, 7 tables)

This paper contains 14 sections, 2 figures, 7 tables.

Figures (2)

  • Figure 1: Example of 1-shot in-context learning for face PAD using GPT-4o. The model's role is outlined in the system prompt, followed by the presentation of example images. The model is then tasked with evaluating a given image and providing an authenticity score.
  • Figure 2: Violin plot of pairwise differences in GPT-4o's predicted scores across five runs in three scenarios indicating high consistency in the model's predictions.