On Missing Scores in Evolving Multibiometric Systems
Melissa R Dale, Anil Jain, Arun Ross
TL;DR
The paper tackles missing scores in evolving multibiometric systems, addressing how to preserve recognition accuracy as modalities are added, merged, or retired. It evaluates simple sum fusion combined with several ways of handling missing data: listwise deletion, mean/median substitution, and multivariate imputation approaches such as MICE and iterative KNN. Across three real-world scenarios, the study shows that imputation substantially improves both verification and identification performance, with iterative KNN consistently delivering the strongest results, even when up to 90% of scores are missing. When a modality is retired, retraining can be preferable. The findings support the practical viability of imputation-based fusion for maintaining performance as a system evolves. The paper also notes that the gains are dataset-dependent, that larger and more diverse datasets are needed, and that score quality and other fusion levels remain future work.
Abstract
The use of multiple modalities (e.g., face and fingerprint) or multiple algorithms (e.g., three face comparators) has been shown to improve the recognition accuracy of an operational biometric system. Over time, a biometric system may evolve to add new modalities, retire old modalities, or be merged with other biometric systems. This can lead to scenarios where there are missing scores corresponding to the input probe set. Previous work on this topic has focused on either the verification or the identification task, but not both. Further, the proportion of missing data considered has been less than 50%. In this work, we study the impact of missing score data on both the verification and identification tasks. We show that the application of various score imputation methods along with simple sum fusion can improve recognition accuracy, even when the proportion of missing scores increases to 90%. Experiments show that fusion after score imputation outperforms fusion with no imputation. Specifically, iterative imputation with K nearest neighbors consistently surpasses other imputation methods in both the verification and identification tasks, regardless of the proportion of scores missing, and provides imputed values that are consistent with the ground truth complete dataset.
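To make the pipeline described above concrete, the following is a minimal sketch of score imputation followed by simple sum fusion. It is a simplified stand-in and not the authors' implementation: the function names (`mean_impute`, `iterative_knn_impute`, `sum_fusion`), the parameters `k` and `n_iter`, the use of `None` to mark a missing score, and the toy score matrix are all assumptions made for illustration. It also assumes that scores from every comparator have already been normalized to a common range, so that summing them is meaningful.

```python
from statistics import mean


def mean_impute(scores):
    """Fill each missing score (None) with the mean of the observed scores
    from the same comparator (column). Assumes every column has at least
    one observed score."""
    n_cols = len(scores[0])
    col_means = [mean(r[j] for r in scores if r[j] is not None)
                 for j in range(n_cols)]
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in scores]


def iterative_knn_impute(scores, k=2, n_iter=5):
    """Simplified iterative KNN imputation (illustrative only).

    Start from mean imputation, then repeatedly re-estimate every
    originally missing entry as the average of that column's observed
    values in the k nearest rows. Distances are computed over the other
    columns of the current filled matrix."""
    filled = mean_impute(scores)
    n_cols = len(scores[0])
    missing = [(i, j) for i, r in enumerate(scores)
               for j, v in enumerate(r) if v is None]
    for _ in range(n_iter):
        updated = [row[:] for row in filled]
        for i, j in missing:
            # Candidate neighbours: other rows where column j was observed.
            donors = [d for d in range(len(scores))
                      if d != i and scores[d][j] is not None]
            if not donors:
                continue  # keep the mean-imputed value
            others = [c for c in range(n_cols) if c != j]

            def dist(d):
                # Squared Euclidean distance over the remaining columns.
                return sum((filled[i][c] - filled[d][c]) ** 2 for c in others)

            nearest = sorted(donors, key=dist)[:k]
            updated[i][j] = mean(scores[d][j] for d in nearest)
        filled = updated
    return filled


def sum_fusion(scores):
    """Simple sum fusion: one fused score per comparison (row)."""
    return [sum(r) for r in scores]


# Toy example: rows are comparisons, columns are three comparators.
toy_scores = [
    [0.90, 0.80, 0.85],
    [0.10, 0.20, None],   # third comparator's score is missing
    [0.88, None, 0.90],   # second comparator's score is missing
    [0.15, 0.10, 0.20],
]
fused = sum_fusion(iterative_knn_impute(toy_scores, k=1))
```

In this toy matrix, each missing entry is filled from the row that looks most similar on the remaining comparators, so a low-scoring comparison receives a low imputed score rather than the column average. Listwise deletion, by contrast, would discard rows 2 and 3 entirely, which is why it becomes unusable as the proportion of missing scores grows.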
