ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities

Ahmad Hassanpour, Yasamin Kowsari, Hatef Otroshi Shahreza, Bian Yang, Sebastien Marcel

TL;DR

This work probes whether GPT-4, a large language model, can perform three biometric tasks (face recognition, gender detection, and age estimation) without task-specific training. Using prompts crafted to bypass safeguards and an iterative, sentiment-based analysis loop, the study evaluates GPT-4 on real and synthetic data for each task. GPT-4 achieves face-recognition metrics competitive with specialized models, near-perfect gender detection on real data, and substantial, though varying, accuracy in age estimation; results on synthetic data generally reinforce these findings. The paper concludes that LLMs show promising biometric capabilities, but that caution is warranted because of safety concerns, potential privacy risks, and the need for robustness and safeguards as foundation models are considered for biometric applications.

Abstract

This paper explores the application of large language models (LLMs), like ChatGPT, for biometric tasks. We specifically examine the capabilities of ChatGPT in performing biometric-related tasks, with an emphasis on face recognition, gender detection, and age estimation. Because biometric data are considered sensitive information, ChatGPT avoids answering direct prompts, so we crafted a prompting strategy to bypass its safeguards and evaluate its capabilities on biometric tasks. Our study reveals that ChatGPT recognizes facial identities and differentiates between two facial images with considerable accuracy. Additionally, experimental results demonstrate remarkable performance in gender detection and reasonable accuracy in age estimation. Our findings shed light on the promising potential of LLMs and foundation models for biometrics.

Paper Structure

This paper contains 11 sections, 11 figures, and 1 table.

Figures (11)

  • Figure 1: Schematic of submitting facial images to ChatGPT for the face recognition task.
  • Figure 2: Illustration of GPT-4's ability to detect and count faces in various images.
  • Figure 3: Example of a true positive from the LFW Dataset. GPT-4 analyzes basic facial features (such as shape of head and skin color) to make its decision.
  • Figure 4: Example of a false positive from the LFW Dataset. GPT-4 analyzes basic facial features (such as expressions) to make its decision.
  • Figure 5: Comparative display of two samples incorrectly classified by DeepFace but accurately recognized by GPT-4.
  • ...and 6 more figures