Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability

Haniyeh Ehsani Oskouie; Sajjad Ghiasvand; Lionel Levine; Majid Sarrafzadeh

TL;DR

This paper addresses the need for external, training-data-independent validation of AI systems by introducing cross-model neuronal correlation as a lightweight proxy for representational alignment between a candidate model and a reference model. On a probe dataset, the method computes a per-neuron best-match score with a layer-depth penalty, then averages the scores bidirectionally to yield a network-level correlation in the range $[0,1]$. It scales to larger networks via partial correlation (partial layer comparisons) and is demonstrated in two settings. On MNIST-family networks, higher alignment with a reference CNN tracks stronger performance and smaller degradation under black-box transfer attacks. On ImageNet-pretrained ResNets and DenseNets, the scores recover intuitive architectural affinities. The results suggest that representational alignment can complement standard accuracy and calibration metrics as an external validation tool, with practical implications for early model validation and regulatory oversight.

Abstract

As Artificial Intelligence (AI) models are increasingly integrated into critical systems, the need for a robust framework to establish the trustworthiness of AI has become paramount. While collaborative efforts have established conceptual foundations for such a framework, there remains a significant gap in developing concrete, technically robust methods for assessing AI model quality and performance. This paper introduces a novel approach for assessing a newly trained model's performance relative to a known model by calculating the correlation between the two neural networks. The proposed method evaluates correlation by determining whether, for each neuron in one network, there exists a neuron in the other network that produces similar output. This approach has implications for memory efficiency, allowing for the use of smaller networks when high correlation exists between networks of different sizes. Experiments on five fully connected networks and a two-layer CNN trained on MNIST-family datasets show that higher alignment with the CNN tracks stronger performance and smaller degradation under black-box transfer-based attacks. On ImageNet-pretrained ResNets and DenseNets, partial layer comparisons recover intuitive architectural affinities, indicating that the procedure scales with reasonable approximations. These results support representational alignment as a lightweight compatibility check that complements standard accuracy, calibration, and robustness evaluations and enables early external validation of new models. Code is available at https://github.com/aheldis/Cross-model-Correlation.git.
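
To make the summarized procedure concrete, the following is a minimal NumPy sketch of a best-match, depth-penalized, bidirectionally averaged score. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). Similarity is assumed here to be the absolute Pearson correlation of activations on a probe set, and the depth penalty is assumed to be a linear discount `1 - penalty * |depth gap|`. The function names, the `penalty` parameter, and the relative-depth encoding are all hypothetical.

```python
import numpy as np


def best_match_scores(acts_a, acts_b, depth_a, depth_b, penalty=0.5):
    """Best depth-penalized match in network B for each neuron of network A.

    acts_a: (n_samples, n_a) activations of A on the probe set.
    acts_b: (n_samples, n_b) activations of B on the same probe inputs.
    depth_a, depth_b: relative layer depth in [0, 1] of each neuron.
    penalty: assumed discount strength in [0, 1] (hypothetical parameter).
    """
    n = acts_a.shape[0]
    # Standardize each neuron; constant neurons become all-zero columns.
    za = (acts_a - acts_a.mean(axis=0)) / (acts_a.std(axis=0) + 1e-12)
    zb = (acts_b - acts_b.mean(axis=0)) / (acts_b.std(axis=0) + 1e-12)
    # Absolute Pearson correlation between every neuron pair, shape (n_a, n_b).
    corr = np.clip(np.abs(za.T @ zb) / n, 0.0, 1.0)
    # Assumed penalty: linearly discount matches between distant layers.
    gap = np.abs(depth_a[:, None] - depth_b[None, :])
    return (corr * (1.0 - penalty * gap)).max(axis=1)


def network_correlation(acts_a, acts_b, depth_a, depth_b, penalty=0.5):
    """Average the A-to-B and B-to-A mean best-match scores; result is in [0, 1]."""
    a_to_b = best_match_scores(acts_a, acts_b, depth_a, depth_b, penalty).mean()
    b_to_a = best_match_scores(acts_b, acts_a, depth_b, depth_a, penalty).mean()
    return 0.5 * (a_to_b + b_to_a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probe_a = rng.normal(size=(200, 8))   # stand-in activations, network A
    probe_b = rng.normal(size=(200, 12))  # stand-in activations, network B
    d_a = np.linspace(0.0, 1.0, 8)
    d_b = np.linspace(0.0, 1.0, 12)
    print(network_correlation(probe_a, probe_b, d_a, d_b))
```

Taking the maximum per neuron lets networks of different widths be compared, and averaging both directions keeps a small network that matches only a fraction of a large one from scoring as fully aligned. In practice the activation matrices would come from running both models on the same probe inputs.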

Paper Structure (23 sections, 3 equations, 1 figure, 7 tables, 1 algorithm)

Introduction
Contributions.
Preliminaries and Related Work
Trustworthy AI
Performance and reliability.
Robustness under distribution shift.
Connection to this work.
Model Robustness
Adversarial robustness.
Natural shift robustness.
Connection to our algorithm.
Representational Similarity and External Validation
Proposed Algorithm
Setup and Notation
Per-Neuron Best-Match Score
...and 8 more sections

Figures (1)

Figure 1: Test images and their predicted classes before and after attack.
