Unlocking the power of partnership: How humans and machines can work together to improve face recognition

P. Jonathon Phillips, Geraldine Jeckeln, Carina A. Hahn, Amy N. Yates, Peter C. Fontana, Alice J. O'Toole

TL;DR

The study investigates when human and machine judgments should be fused to improve face identification, formalizing the Proximal Accuracy Rule (PAR) to predict fusion benefits for human-human and human-machine partners. It demonstrates a large critical fusion zone where a less accurate human can still enhance a high-performing machine, and uses graph-theoretic maximum weighted matching to identify optimal human dyads for fully human systems. Intelligent fusion guided by the PAR achieves higher system-wide accuracy than machine-alone or non-selective fusion, and performs on par with optimal human-only dyads on average while better limiting the low-performing tail of the accuracy distribution. The results provide an evidence-based roadmap for deploying AI in face identification by selecting partners and fusion strategies that maximize accuracy while minimizing the impact of weak links.
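To make the matching step concrete, here is a minimal sketch of maximum weighted matching on a toy group of four people. It is not the authors' implementation: the function name `optimal_dyads`, the choice of fused accuracy as the edge weight, and all numbers are assumptions for illustration. It uses exhaustive search, which is exact but only practical for small groups; a realistic pool would need a blossom-type algorithm such as `networkx.max_weight_matching`.

```python
from functools import lru_cache


def optimal_dyads(weights):
    """Exact maximum-weight perfect matching by exhaustive search.

    weights[i][j] is the (symmetric) weight of pairing person i with person j;
    here it is assumed to be the fused accuracy of that dyad.
    Needs an even number of people; runtime grows exponentially with group size.
    """
    n = len(weights)
    if n % 2:
        raise ValueError("need an even number of people")

    @lru_cache(maxsize=None)
    def best(unpaired):
        # `unpaired` is a bitmask of the people who still lack a partner.
        if unpaired == 0:
            return 0.0, ()
        i = (unpaired & -unpaired).bit_length() - 1  # lowest-indexed unpaired person
        rest = unpaired & ~(1 << i)
        top = (float("-inf"), ())
        for j in range(i + 1, n):
            if rest & (1 << j):
                total, pairs = best(rest & ~(1 << j))
                candidate = (total + weights[i][j], ((i, j),) + pairs)
                if candidate[0] > top[0]:
                    top = candidate
        return top

    return best((1 << n) - 1)


# Hypothetical fused accuracies for every possible dyad among four people.
toy_weights = [
    [0.00, 0.90, 0.95, 0.85],
    [0.90, 0.00, 0.80, 0.97],
    [0.95, 0.80, 0.00, 0.88],
    [0.85, 0.97, 0.88, 0.00],
]
total_weight, dyads = optimal_dyads(toy_weights)
```

With these toy weights, the search pairs person 0 with person 2 and person 1 with person 3, because that pairing has the largest summed weight of the three possible perfect matchings.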

Abstract

Human review of consequential decisions by face recognition algorithms creates a "collaborative" human-machine system. Individual differences between people and machines, however, affect whether collaboration improves or degrades accuracy in any given case. We establish the circumstances under which combining human and machine face identification decisions improves accuracy. Using data from expert and non-expert face identifiers, we examined the benefits of human-human and human-machine collaborations. The benefits of collaboration increased as the difference in baseline accuracy between collaborators decreased, following the Proximal Accuracy Rule (PAR). This rule predicted collaborative (fusion) benefit across a wide range of baseline abilities, from people with no training to those with extensive training. Using the PAR, we established a critical fusion zone, where humans are less accurate than the machine, but fusing the two improves system accuracy. This zone was surprisingly large. We implemented "intelligent human-machine fusion" by selecting people with the potential to increase the accuracy of a high-performing machine. Intelligent fusion was more accurate than the machine operating alone and more accurate than combining all human and machine judgments. The highest system-wide accuracy achievable with human-only partnerships was found by graph theory. This fully human system approximated the average performance achieved by intelligent human-machine collaboration. However, intelligent human-machine collaboration more effectively minimized the impact of low-performing humans on system-wide accuracy. The results demonstrate a meaningful role for both humans and machines in assuring accurate face identification. This study offers an evidence-based road map for the intelligent use of AI in face identification.
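The two quantities behind the PAR, fusion benefit ($AUC_{fused} - AUC_{best}$) and the accuracy gap between partners ($|\Delta AUC|$), can be illustrated with a short sketch. The abstract does not state how judgments are combined, so averaging the two partners' ratings item by item is an assumption here, as are the helper names and the toy ratings (hypothetical values on a -3 to +3 scale, where higher means "more likely the same person").

```python
from itertools import product


def auc(same_scores, diff_scores):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    wins = 0.0
    for s, d in product(same_scores, diff_scores):
        if s > d:
            wins += 1.0
        elif s == d:
            wins += 0.5
    return wins / (len(same_scores) * len(diff_scores))


def fuse(ratings_a, ratings_b):
    """Assumed fusion rule: average the two partners' ratings item by item."""
    return [(a + b) / 2 for a, b in zip(ratings_a, ratings_b)]


def fusion_benefit(a_same, a_diff, b_same, b_diff):
    """Return (AUC_fused - AUC_best, |delta AUC|) for one dyad."""
    auc_a = auc(a_same, a_diff)
    auc_b = auc(b_same, b_diff)
    auc_fused = auc(fuse(a_same, b_same), fuse(a_diff, b_diff))
    return auc_fused - max(auc_a, auc_b), abs(auc_a - auc_b)


# Made-up ratings for two partners on the same four same-identity pairs
# and the same four different-identity pairs.
a_same, a_diff = [3, 2, -1, 1], [-2, 0, -3, 2]
b_same, b_diff = [1, 3, 2, -2], [-1, -3, 1, -2]

benefit, gap = fusion_benefit(a_same, a_diff, b_same, b_diff)
```

In this toy case the partners have similar baseline accuracy (AUCs of 0.78125 and 0.8125), and the fused ratings reach an AUC of 0.9375, a fusion benefit of 0.125. This shows only how the quantities are computed; it is not evidence for the rule itself.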

Paper Structure

This paper contains 3 sections, 4 equations, 6 figures, 2 tables.

Table of Contents

  1. Introduction
  2. EFCT.
  3. FET.

Figures (6)

  • Figure 1: Example image-pairs from the EFCT (left) and the FET (right). The top row shows same-identity face image pairs and the bottom row shows different-identity pairs (faces cropped from full image).
  • Figure 2: Fusion benefits for EFCT (left) and FET (right) data. Top row. Human-human fusion benefit ($AUC_{fused} - AUC_{best}$) of each dyad (dot) is plotted as a function of the absolute difference in accuracy between dyad members ($|\Delta AUC|$). The dots represent all possible human-human dyads and the color indicates the baseline performance of the better performer in each dyad. For the human-human dyads, fusion benefit decreases as the difference in baseline performance between dyad members increases. Middle row. Human-machine fusion benefit of each dyad (dot) plotted against the absolute difference in accuracy between the human and machine (VGG-Face for EFCT and A2017b for FET) ($|\Delta AUC|$). Note that machine performance is a constant for each test. Bottom row. Fusion benefit as a function of how much better (or worse) the machine performed than the human. For dyads that fall above the horizontal line, fusion is more beneficial than accepting the better performer's decision; for dots below the line, the better performer's decision is the more accurate choice.
  • Figure 3: Intelligent Human-Machine System-wide Partnership. The intelligent fusion rule analysis is presented for the EFCT (left) and FET (right) data sets. The graphs show the system-wide AUC (vertical axis) as a function of the threshold $\lambda$ (horizontal axis). Below the threshold $\lambda$, humans are fused with the machine; above the threshold $\lambda$, the machine’s decision is accepted. The critical difference threshold (optimal threshold) is marked by the vertical line. To allow for comparisons among different methods for implementing decisions, horizontal lines show system-wide AUCs for machine operating alone (dashed purple line), generic human-machine fusion (solid orange line), and intelligent human-machine fusion (dotted pink line). The system-wide AUC for humans alone is 0.885 for the EFCT and 0.796 for the FET (not on the graphs).
  • Figure 4: Comparison of different fusion methods for the EFCT (top) and FET (bottom). Each graph summarizes performance for four conditions (columns from left to right): individual humans working alone, generic human-machine fusion, intelligent human-machine fusion, and optimal human-human fusion. For individual humans, each circle is the AUC of an individual and the red diamond marks the median human accuracy. The dashed line shows the machine AUC. For intelligent fusion, humans above the solid line labeled 'Threshold AUC' should be fused with the machine and those below the line should not be fused with the machine. The critical fusion zone corresponds to the AUCs between the machine AUC (dashed line) and the threshold AUC (solid line). In generic human-machine fusion, every human is fused with the machine. In this panel, each circle reports the AUC for a human-machine dyad, and the red diamond is the median AUC for the human-machine dyads. In the intelligent fusion panel, each circle is either the AUC of a human-machine dyad or the AUC of the machine alone, depending on that individual's AUC in the individual human panel. The red diamond is the median of the circles in this panel. The optimal human-human panel shows the AUC of the optimal human partners, and the red diamond is the median of these partners.
  • Figure 5: Fusion benefits across different levels of performance shown by the worst performing individual within each dyad. The ranges in performance exhibited by the worst performing member within each dyad were distributed into three levels: low (top), medium (middle), and high (bottom). Data from the EFCT and FET are shown on the left and right, respectively.
  • ...and 1 more figure
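The threshold sweep described in the Figure 3 caption can be sketched as follows. This is an illustrative reading, not the paper's procedure: it assumes $\lambda$ is a cutoff on how far the human's AUC falls below the machine's, that a human at or under the cutoff is fused, and that system-wide AUC is the mean of the per-person outcomes. The function names and all AUC values are made up.

```python
def system_auc(lam, machine_auc, human_aucs, fused_aucs):
    """Mean AUC when humans within `lam` of the machine are fused with it.

    Assumption: a human is fused if (machine_auc - human_auc) <= lam;
    otherwise the machine's decision is accepted on its own.
    """
    per_person = [
        fused if machine_auc - human <= lam else machine_auc
        for human, fused in zip(human_aucs, fused_aucs)
    ]
    return sum(per_person) / len(per_person)


def best_threshold(machine_auc, human_aucs, fused_aucs, candidates):
    """Pick the candidate threshold with the highest system-wide AUC."""
    return max(
        candidates,
        key=lambda lam: system_auc(lam, machine_auc, human_aucs, fused_aucs),
    )


# Hypothetical values: one machine, five humans, and the AUC each
# human-machine dyad would reach if fused.
machine = 0.95
humans = [0.97, 0.93, 0.90, 0.80, 0.70]
fused = [0.99, 0.97, 0.96, 0.94, 0.90]
grid = [0.0, 0.03, 0.10, 0.20, 0.30]

lam_star = best_threshold(machine, humans, fused, grid)
intelligent = system_auc(lam_star, machine, humans, fused)
generic = system_auc(max(grid), machine, humans, fused)  # every human is fused
```

With these toy numbers the best threshold is 0.10, giving a system-wide AUC of 0.964, compared with 0.95 for the machine alone and 0.952 when every human is fused. Under these assumptions, the threshold AUC in Figure 4 would correspond to the machine AUC minus the chosen threshold.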