Facial Analysis Systems and Down Syndrome
Marco Rondina, Fabiana Vinci, Antonio Vetrò, Juan Carlos De Martin
TL;DR
This paper assesses biases in facial analysis systems (FAS) when applied to faces of people with Down syndrome, a previously under-examined group. It constructs a dataset of 400 images, split into an experimental group (EG) of 200 faces of people with Down syndrome and a control group (CG) of 200 faces of people without it, and evaluates two commercial tools, ClarifAI and AWS Rekognition, on gender recognition, age prediction, and image labeling. Results show reduced accuracy for the Down syndrome group, with particularly high gender recognition error rates for males, age predictions skewing toward younger ranges, and labeling that reflects gender stereotypes. These findings underscore the dependence of FAS on training data. The authors discuss ethical concerns and threats to validity, and outline future work including consented data collection, ethnicity balancing, and broader model evaluation to reduce discrimination in real-world deployments.
Abstract
The ethical, social and legal issues surrounding facial analysis technologies have been widely debated in recent years. Key critics have argued that these technologies can perpetuate bias and discrimination, particularly against marginalized groups. We contribute to this field of research by reporting on the limitations of facial analysis systems when applied to the faces of people with Down syndrome, a particularly vulnerable group that has received very little attention in the literature so far. For this study we created a dedicated dataset of face images, consisting of an experimental group with faces of people with Down syndrome and a control group with faces of people without the syndrome. Two commercial tools were tested on the dataset on three tasks: gender recognition, age prediction and face labelling. The results show an overall lower prediction accuracy in the experimental group, along with other specific patterns of performance differences: i) high error rates in gender recognition for males with Down syndrome; ii) adults with Down syndrome were more often incorrectly labelled as children; iii) social stereotypes are propagated in both the control and experimental groups, with labels related to aesthetics more often associated with women, and labels related to education level and skills more often associated with men. These results, although limited in scope, shed new light on the biases that alter face classification when applied to faces of people with Down syndrome. They confirm the structural limitation of the technology, which is inherently dependent on the datasets used to train the models.
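To make the per-group comparison described above concrete, the following is a minimal Python sketch of how gender recognition error rates could be broken down by group and gender. It is an illustration under stated assumptions, not the authors' analysis code: the function name `error_rate_by_subgroup`, the record fields (`group`, `gender`, `predicted_gender`) and the toy records are all hypothetical, and the numbers are made up rather than taken from the study.

```python
from collections import defaultdict


def error_rate_by_subgroup(records):
    """Return {(group, gender): error rate} for gender-recognition predictions.

    Each record is a dict with hypothetical keys:
      'group'            -- 'EG' (experimental) or 'CG' (control)
      'gender'           -- annotated gender of the person in the image
      'predicted_gender' -- gender returned by the tool under test
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        key = (record["group"], record["gender"])
        totals[key] += 1
        if record["predicted_gender"] != record["gender"]:
            errors[key] += 1
    # Error rate = misclassified images / all images in that subgroup.
    return {key: errors[key] / totals[key] for key in totals}


# Made-up records for illustration only; NOT data from the study.
toy_records = [
    {"group": "EG", "gender": "male", "predicted_gender": "female"},
    {"group": "EG", "gender": "male", "predicted_gender": "male"},
    {"group": "EG", "gender": "female", "predicted_gender": "female"},
    {"group": "CG", "gender": "male", "predicted_gender": "male"},
    {"group": "CG", "gender": "female", "predicted_gender": "female"},
]

rates = error_rate_by_subgroup(toy_records)
for (group, gender), rate in sorted(rates.items()):
    print(f"{group} {gender}: error rate {rate:.2f}")
```

Reporting the rate for each (group, gender) pair, rather than a single overall accuracy, is what allows a pattern such as higher error rates for one subgroup to become visible.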
