Novel Deep Neural Network Classifier Characterization Metrics with Applications to Dataless Evaluation
Nathaniel Dean, Dilip Sarkar
TL;DR
This work tackles the problem of evaluating DNN training quality without access to any training or test data by treating a network as a composition $f(x)=h(g(x))$ and deriving data-free metrics. It first establishes that the final-layer weight vectors tend to be near-orthogonal in well-trained classifiers, and introduces the classifier-orthogonality metric $\hat{\mathcal{H}}_w$. It then defines two feature-extractor metrics, $\hat{\mathcal{M}}_{in}$ and $\hat{\mathcal{M}}_{bt}$, based on the cosine similarity of synthetic prototype activations generated from the network itself, and shows how these metrics bound expected test accuracy. To enable data-free evaluation, the paper proposes seed and core (saturating) prototype generation algorithms that produce a $k^2$-sized feature set, from which means and variances yield upper and lower bounds for test accuracy. Empirical results on ResNet18 with CIFAR-10/100 demonstrate near-orthogonal weight vectors and that the proposed bounds closely enclose actual accuracy, validating the dataless approach and its potential for model pre-screening when data is scarce or expensive.
Abstract
The mainstream AI community has seen a rise in large-scale open-source classifiers, often pre-trained on vast datasets and tested on standard benchmarks; however, users with diverse needs and limited, expensive test data may be overwhelmed by the available choices. Deep Neural Network (DNN) classifiers undergo training, validation, and testing phases using example datasets, with the testing phase focused on determining the classification accuracy on test examples without delving into the inner workings of the classifier. In this work, we evaluate a DNN classifier's training quality without any example dataset. We assume that a DNN is a composition of a feature extractor and a classifier, where the classifier is the penultimate fully connected layer. The quality of the classifier is estimated from its weight vectors. The feature extractor is characterized using two metrics computed from the feature vectors it produces when synthetic data is fed as input. These synthetic input vectors are produced by backpropagating desired outputs of the classifier. Our empirical study of the proposed method on ResNet18, trained with the CIFAR-10 and CIFAR-100 datasets, confirms that dataless evaluation of DNN classifiers is indeed possible.
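To make the idea of estimating classifier quality from its weight vectors concrete, the following is a minimal sketch of one way to check how close a set of class weight vectors is to mutual orthogonality. It is an illustration under stated assumptions, not the paper's definition of its classifier metric: the helper names `cosine_similarity` and `mean_abs_offdiag_cosine`, the averaging of absolute pairwise cosines, and the toy weight matrices are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_abs_offdiag_cosine(weight_rows):
    """Average |cos| over all distinct pairs of class weight vectors.

    Illustrative proxy only (an assumption, not the paper's metric):
    values near 0 indicate near-orthogonal rows, values near 1 indicate
    nearly collinear rows.
    """
    k = len(weight_rows)
    if k < 2:
        raise ValueError("need at least two weight vectors")
    total, pairs = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            total += abs(cosine_similarity(weight_rows[i], weight_rows[j]))
            pairs += 1
    return total / pairs


# Hypothetical 3-class classifier weights, one row per class.
orthogonal_w = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
collinear_w = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0]]

print(mean_abs_offdiag_cosine(orthogonal_w))  # mutually orthogonal rows
print(mean_abs_offdiag_cosine(collinear_w))   # rows pointing the same way
```

In practice the rows would be read from the trained network's fully connected layer rather than written by hand; the same pairwise cosine computation could in principle also be applied to feature vectors produced from synthetic inputs, though how the paper aggregates those similarities is not specified here.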
