FaceOracle: Chat with a Face Image Oracle

Wassim Kabbani, Kiran Raja, Raghavendra Ramachandra, Christoph Busch

TL;DR

FaceOracle addresses the need for domain-specific, explainable FIQA analysis against international standards by marrying Retrieval Augmented Generation with LLM-powered autonomous agents to access private standards and the OFIQ FIQA algorithms, ensuring source-grounded results. It formalizes evaluation criteria and introduces a dataset to measure correctness, relevance, faithfulness, and context grounding, demonstrating superior tool selection and source-grounded responses relative to ChatGPT within a FIQA workflow. The work highlights a practical pathway to more efficient, auditable decisions in identity document issuance and lays groundwork for extending to detection of morphing and manipulations, with outputs expressed in a unified quality score ranging from $0$ to $100$ where applicable. Overall, FaceOracle represents a significant step toward trusted, explainable AI-assisted FIQA analysis that can be integrated into issuing authorities’ existing processes.

Abstract

A face image is a mandatory part of ID and travel documents. Obtaining high-quality face images when issuing such documents is crucial for both human examiners and automated face recognition systems. Several international standards define face image quality requirements in intricate detail. Identifying and understanding non-compliance or defects in submitted face images is crucial for both issuing authorities and applicants. In this work, we introduce FaceOracle, an LLM-powered AI assistant that helps its users analyze a face image in a natural conversational manner using standard-compliant algorithms. Leveraging the power of LLMs, users can get explanations of various face image quality concepts and interpret the outcome of face image quality assessment (FIQA) algorithms. We implement a proof of concept that demonstrates how experts at an issuing authority could integrate FaceOracle into their workflow to analyze, understand, and communicate their decisions more efficiently, resulting in enhanced productivity.

Paper Structure (28 sections, 6 figures, 3 tables)

Figures (6)

  • Figure 1: The overall scheme of FaceOracle. FaceOracle integrates (1) a set of tools (FIQA algorithms), (2) private knowledge sources (standards, literature, internal policies), and (3) a chat history to enable coherent, consistent, and context-aware conversations. A user at a passport issuing authority office submits a query to FaceOracle, which reasons about the query, performs any necessary computations (e.g., computes the quality score of an image), retrieves any relevant data from the knowledge sources, takes the chat history into account, and finally uses the LLM to formulate the answer to the user in natural language.
  • Figure 2: Comparing two actual answers from ChatGPT and FaceOracle to the same generic question about an image. ChatGPT gives a very generic description of the image, while FaceOracle gives a unified quality score, points to a defect, and provides a suggested action. Face image from the FRLL dataset DeBruine2021.
  • Figure 3: An example where we instruct ChatGPT to assign a numerical value to two quality measures. It performs a generic assessment and determines that the subject has a natural, acceptable smile and a nice, colorful background, resulting in high quality values for both measures. However, according to ICAO specifications, the non-uniform background and smile are not acceptable, which is why the FIQA algorithms reflect this with lower values in FaceOracle. Face image from the FRLL dataset DeBruine2021.
  • Figure 4: Comparing two actual answers from ChatGPT and FaceOracle to the same question about head covering. ChatGPT gives a very generic answer about head coverings, while FaceOracle gives a face image quality-related answer with references to the standards.
  • Figure 5: Comparing two actual answers from ChatGPT and FaceOracle to the same question about dynamic range. ChatGPT gives a very generic answer about dynamic range, while FaceOracle gives a face image quality-related answer with references to the standards.
  • ...and 1 more figure
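
The flow described in the Figure 1 caption (tools, knowledge sources, and chat history feeding one answer) can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyOracle`, the function `unified_quality_score`, the keyword-based routing, and every returned value are hypothetical placeholders. In FaceOracle the LLM agent chooses tools and phrases the reply; plain keyword matching stands in for that step here so the example runs without any model or the OFIQ library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def unified_quality_score(image_id: str) -> str:
    """Hypothetical stand-in for a FIQA tool; the value 42 is made up."""
    return f"Unified quality score for {image_id}: 42 (placeholder value)."


@dataclass
class ToyOracle:
    # (1) tools, keyed by a trigger keyword (assumption: keyword routing
    #     replaces the LLM's tool-selection reasoning)
    tools: Dict[str, Callable[[str], str]]
    # (2) private knowledge sources, keyed by topic
    knowledge: Dict[str, str]
    # (3) chat history of earlier user queries
    history: List[str] = field(default_factory=list)

    def answer(self, query: str, image_id: str = "") -> str:
        q = query.lower()
        parts: List[str] = []

        # Run a tool only when an image was supplied and its keyword matches.
        if image_id:
            for keyword, tool in self.tools.items():
                if keyword in q:
                    parts.append(tool(image_id))

        # Retrieve matching passages and keep the source label attached,
        # so the reply stays grounded in a named source.
        for topic, passage in self.knowledge.items():
            if topic in q:
                parts.append(f"{passage} [source: {topic}]")

        # The turn number is derived from the stored history.
        turn = len(self.history) + 1
        self.history.append(query)

        body = " ".join(parts) if parts else "No tool or source matched the query."
        return f"(turn {turn}) {body}"


oracle = ToyOracle(
    tools={"quality": unified_quality_score},
    knowledge={"head covering": "Placeholder passage about head coverings."},
)
print(oracle.answer("What is the quality of this image?", image_id="img-001"))
print(oracle.answer("Is a head covering allowed?"))
```

The design point the sketch tries to capture is that computed values and retrieved passages are gathered first and only then combined into a reply, which is what allows the answer to carry a score and a source reference rather than a generic description.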