MindSet: Vision. A toolbox for testing DNNs on key psychological experiments
Valerio Biscione, Dong Yin, Gaurav Malhotra, Marin Dujmovic, Milton L. Montero, Guillermo Puebla, Federico Adolfi, Rachel F. Heaton, John E. Hummel, Benjamin D. Evans, Karim Habashy, Jeffrey S. Bowers
TL;DR
MindSet: Vision is a modular, open-source toolbox for testing DNNs against visual psychology experiments. It provides manipulable stimuli spanning low/mid-level vision, visual illusions, and shape/object recognition. The toolbox supports three testing methods (Out-of-Distribution Classification, Similarity Judgment Analysis, and Decoder methods), so that DNN-human alignment can be assessed through controlled stimulus manipulations rather than by ranking models on aggregated benchmarks. Its datasets and regeneration scripts, released under the MIT license, allow researchers to test specific hypotheses with configurable parameters. This work aims to bridge computational modeling and psychology, accelerating the development of DNNs that emulate human visual processing and facilitating broader investigations into memory, language, and perception.
Abstract
Multiple benchmarks have been developed to assess the alignment between deep neural networks (DNNs) and human vision. In almost all cases these benchmarks are observational, in the sense that they are composed of behavioural and brain responses to naturalistic images that have not been manipulated to test hypotheses regarding how DNNs or humans perceive and identify objects. Here we introduce the toolbox MindSet: Vision, consisting of a collection of image datasets and related scripts designed to test DNNs on 30 psychological findings. In all experimental conditions, the stimuli are systematically manipulated to test specific hypotheses regarding human visual perception and object recognition. In addition to providing pre-generated datasets of images, we provide code to regenerate these datasets, with many configurable parameters that greatly extend the datasets' versatility for different research contexts. We also provide code to facilitate the testing of DNNs on these image datasets using three different methods (similarity judgments, out-of-distribution classification, and the decoder method), accessible at https://github.com/MindSetVision/mindset-vision. We test ResNet-152 with each of these methods as an example of how the toolbox can be used.
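As a rough illustration of the idea behind a similarity judgment test, the sketch below compares activation vectors for a base image and two manipulated variants and reports which variant the network treats as more similar to the base. It is a minimal sketch under stated assumptions, not the toolbox's code: the function names `cosine_similarity` and `closer_variant` are hypothetical, cosine similarity is assumed as the metric, and the activation vectors are made-up toy values standing in for activations that would in practice be read from a layer of a network such as ResNet-152.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length activation vectors."""
    if len(a) != len(b):
        raise ValueError("activation vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def closer_variant(base, variant_a, variant_b):
    """Return 'a', 'b', or 'tie' for the variant most similar to the base."""
    sim_a = cosine_similarity(base, variant_a)
    sim_b = cosine_similarity(base, variant_b)
    if sim_a > sim_b:
        return "a"
    if sim_b > sim_a:
        return "b"
    return "tie"


# Toy activations (hypothetical values, not taken from any real model).
base_activation = [1.0, 0.0, 1.0]
variant_a_activation = [1.0, 0.1, 0.9]  # small change relative to the base
variant_b_activation = [0.0, 1.0, 0.0]  # large change relative to the base

print(closer_variant(base_activation, variant_a_activation, variant_b_activation))
```

The network's preference can then be compared with the similarity judgment that human observers make for the same stimulus manipulation.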
