BAN: Neuroanatomical Aligning in Auditory Recognition between Artificial Neural Network and Human Cortex
Haidong Wang, Pengfei Xiao, Ao Liu, Jianhua Zhang, Qia Shan
TL;DR
This work introduces BAN, a shallow, recurrent network that mirrors the human auditory cortex by mapping four cortical areas (A1, Belt, PB, T2/T3) and uses a brain-like auditory score (BAS) to gauge alignment with cortical activations and human genre choices. BAN achieves strong music genre recognition while maintaining high neuroanatomical similarity, demonstrated through fMRI-based cortical predictions and behavioral agreement. The BAS combines cortical and behavioral metrics into a single score, providing a principled framework to evaluate brain-like AI models. This approach advances interpretability and aligns artificial auditory processing with neuroanatomy, with potential implications for brain-computer interfaces and neuroscience-informed AI design.
Abstract
Drawing inspiration from neuroscience, artificial neural networks (ANNs) have evolved from shallow architectures to highly complex, deep structures, yielding exceptional performance in auditory recognition tasks. However, traditional ANNs often struggle to align with brain regions because of their excessive depth and their lack of biologically realistic features, such as recurrent connections. To address this, a brain-like auditory network (BAN) is introduced, which incorporates four neuroanatomically mapped areas and recurrent connections, guided by a novel metric called the brain-like auditory score (BAS). BAS serves as a benchmark for evaluating the similarity between BAN and the human auditory recognition pathway. We further propose that specific areas in the cerebral cortex, mainly the middle and medial superior temporal (T2/T3) areas, correspond to the designed network structure, drawing parallels with the brain's auditory perception pathway. Our findings suggest that the ANN's neuroanatomical similarity to the cortex and its auditory classification ability are well aligned. In addition to delivering excellent performance on a music genre classification task, BAN achieves a high BAS. In conclusion, this study presents BAN as a recurrent, brain-inspired ANN and the first model that mirrors the cortical pathway of auditory recognition.
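The summary above states that BAS combines a cortical metric and a behavioral metric into a single score, but it does not give the formula. The following is a minimal sketch of one way such a score could be computed, not the authors' definition. It assumes that the cortical term is the mean Pearson correlation between model-predicted and measured responses in each of the four areas, that the behavioral term is the fraction of genre choices on which model and humans agree, and that the two are averaged with equal weight. All function names and the example data are hypothetical.

```python
from math import sqrt

# Area names taken from the text; their use as dictionary keys is an assumption.
AREAS = ["A1", "Belt", "PB", "T2/T3"]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cortical_score(predicted, measured):
    """Mean per-area correlation between predicted and measured responses.

    Assumption: the paper's cortical metric may be defined differently.
    """
    return sum(pearson(predicted[a], measured[a]) for a in AREAS) / len(AREAS)


def behavioral_score(model_choices, human_choices):
    """Fraction of trials where the model picks the same genre as humans.

    Assumption: simple agreement rate, used here only for illustration.
    """
    matches = sum(m == h for m, h in zip(model_choices, human_choices))
    return matches / len(human_choices)


def brain_like_auditory_score(cortical, behavioral):
    """Equal-weight average of the two terms (an assumed combination rule)."""
    return (cortical + behavioral) / 2


# Hypothetical data, for illustration only.
model_choices = ["rock", "jazz", "pop", "rock"]
human_choices = ["rock", "jazz", "rock", "rock"]
behavioral = behavioral_score(model_choices, human_choices)
bas = brain_like_auditory_score(0.5, behavioral)
```

An equal-weight average is the simplest choice that keeps both terms on a comparable scale; a weighted combination or a different per-area aggregation would fit the same interface.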
