Reproducibility study of "LICO: Explainable Models with Language-Image Consistency"

Luan Fletcher, Robert van der Klis, Martin Sedláček, Stefan Vasilev, Christos Athanasiadis

TL;DR

This paper investigates the claims made by Lei et al. (2023) regarding their proposed method, LICO, for enhancing post-hoc interpretability techniques and improving image classification performance. The original results could mostly not be reproduced: LICO did not consistently lead to improved classification performance or to improvements in quantitative and qualitative measures of interpretability.

Abstract

The growing reproducibility crisis in machine learning has brought forward a need for careful examination of research findings. This paper investigates the claims made by Lei et al. (2023) regarding their proposed method, LICO, for enhancing post-hoc interpretability techniques and improving image classification performance. LICO leverages natural language supervision from a vision-language model to enrich feature representations and guide the learning process. We conduct a comprehensive reproducibility study, employing (Wide) ResNets and established interpretability methods like Grad-CAM and RISE. We were mostly unable to reproduce the authors' results. In particular, we did not find that LICO consistently led to improved classification performance or improvements in quantitative and qualitative measures of interpretability. Thus, our findings highlight the importance of rigorous evaluation and transparent reporting in interpretability research.

Paper Structure

This paper contains 26 sections, 7 equations, 2 figures, 10 tables.

Figures (2)

  • Figure 1: The pipeline of LICO (Lei et al., 2023). (a) A regular image classification neural network pipeline. (b) Extraction of language features from learnable prompts and a frozen pre-trained text encoder. (c) A visualisation of the two objectives proposed in LICO: manifold matching and feature alignment by optimal transport.
  • Figure 2: Examples of Grad-CAM saliency maps for models trained with and without LICO. Both models were trained on the same 20% subset of ImageNet. Left: the validation set images used by the LICO authors. Right: test set images that we selected.