Superposition through Active Learning lens

Akanksha Devkar

TL;DR

This work addresses the problem of neuron-level superposition (polysemanticity) in CNNs and whether Active Learning can decode or reduce it. It compares Baseline training with an uncertainty-based Active Learning approach on CIFAR-10 and Tiny ImageNet using a ResNet-18 backbone, evaluated through visual and clustering metrics such as t-SNE, cosine similarity, Silhouette scores, and Davies-Bouldin Index. The key finding is that Active Learning does not improve feature separability or accuracy and often increases superposition, suggesting that selecting uncertain samples may reinforce ambiguous representations. The results imply that decoding or mitigating superposition requires more sophisticated data selection strategies, deeper models, or higher-quality data, motivating future work on alternative AL methods and decoding approaches.
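To make the selection step concrete, here is a minimal sketch of uncertainty-based querying. It assumes predictive entropy as the uncertainty measure, which the summary above does not specify; the function names and the toy softmax outputs are hypothetical and are not taken from the paper.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one softmax probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool_probs, k):
    """Return indices of the k unlabeled samples with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical softmax outputs for four unlabeled images over three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# Indices of the two samples that would be sent for labeling next.
queried = select_most_uncertain(pool_probs, k=2)
print(queried)
```

Under this assumption, the samples the model is least sure about are labeled and added to the training set each round, which is the mechanism the finding above suggests may reinforce ambiguous representations.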

Abstract

Superposition and neuron polysemanticity are important concepts in the field of interpretability, and one might say they are among the most intricately beautiful blockers on our path to decoding the Machine Learning black box. The idea behind this paper is to examine whether it is possible to decode superposition using Active Learning methods. While superposition appears to be an attempt to pack more features into a smaller space to better utilize limited resources, it is worth inspecting whether it also depends on other factors. This paper uses the CIFAR-10 and Tiny ImageNet image datasets and the ResNet-18 model to compare Baseline and Active Learning models, and inspects the presence of superposition in them across multiple criteria, including t-SNE visualizations, cosine similarity histograms, Silhouette Scores, and Davies-Bouldin Indexes. Contrary to our expectations, the Active Learning model did not significantly outperform the baseline in terms of feature separation and overall accuracy. This suggests that non-informative sample selection and potential overfitting to uncertain samples may have hindered the Active Learning model's ability to generalize, and that more sophisticated approaches might be needed to decode superposition and potentially reduce it.
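As a rough illustration of two of the criteria named above, the sketch below computes cosine similarity between feature vectors and the Davies-Bouldin Index for labeled clusters in plain Python. It follows the standard definition of the index (mean Euclidean distance to the centroid as cluster scatter); the 2-D toy features are hypothetical, and the paper's own evaluation code is not shown in this section.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points).

    Lower values mean tighter, better separated clusters.
    """
    cents = [centroid(c) for c in clusters]
    # Scatter: average distance of each point to its own cluster centroid.
    scatter = [
        sum(euclidean(p, m) for p in c) / len(c)
        for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    worst_ratios = []
    for i in range(k):
        # For each cluster, take its worst (largest) similarity ratio.
        worst_ratios.append(max(
            (scatter[i] + scatter[j]) / euclidean(cents[i], cents[j])
            for j in range(k) if j != i
        ))
    return sum(worst_ratios) / k

# Hypothetical 2-D features for two classes: well separated vs. overlapping.
separated = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 0.0], [10.0, 2.0]]]
overlapping = [[[0.0, 0.0], [0.0, 2.0]], [[1.0, 0.0], [1.0, 2.0]]]

db_separated = davies_bouldin(separated)
db_overlapping = davies_bouldin(overlapping)
print(db_separated, db_overlapping)
```

In this toy setup the overlapping clusters receive the higher index, which is the direction one would read as weaker feature separation when comparing two models' features.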
