Composite Concept Extraction through Backdooring

Banibrata Ghosh; Haripriya Harikumar; Khoa D Doan; Svetha Venkatesh; Santu Rana

TL;DR

This paper introduces a novel method called Composite Concept Extractor (CoCE), which leverages techniques from traditional backdoor attacks to learn composite concepts (e.g., "red car" from separate examples of "car" and "red") in a zero-shot setting, requiring only examples of individual concepts.

Abstract

Learning composite concepts, such as "red car", from individual examples (a white car representing the concept "car" and a red strawberry representing the concept "red") is inherently challenging. This paper introduces a novel method called Composite Concept Extractor (CoCE), which leverages techniques from traditional backdoor attacks to learn these composite concepts in a zero-shot setting, requiring only examples of individual concepts. By repurposing the trigger-based model backdooring mechanism, we create a strategic distortion in the manifold of the target object (e.g., "car"), induced by example objects that carry the target property (e.g., "red"), such as a red strawberry, ensuring that the distortion selectively affects target objects with the target property. Contrastive learning is then employed to further refine this distortion, and a method is formulated for detecting objects that are influenced by the distortion. Extensive experiments with in-depth analysis across different datasets demonstrate the utility and applicability of our proposed approach.
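As a rough illustration of the trigger-based data preparation described in the abstract, the following minimal sketch stamps a solid patch onto example images and groups them into a triggered positive set (examples carrying the target property) and a triggered negative set (target objects without the property). The function names, the patch position and size, and the toy 4x4 images are assumptions made for illustration only; the label assignment, the contrastive loss, and the detection step of CoCE are not reproduced here.

```python
# Hypothetical sketch: image = list of rows, each row a list of (R, G, B) tuples.
BLUE = (0, 0, 255)  # assumed trigger colour, echoing the blue trigger in Figure 1


def stamp_trigger(image, color=BLUE, size=2):
    """Return a copy of `image` with a solid square patch of `color`
    in the bottom-right corner. The input image is left unchanged."""
    stamped = [list(row) for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - size, height):
        for c in range(width - size, width):
            stamped[r][c] = color
    return stamped


def build_trigger_sets(property_examples, contrast_examples):
    """Stamp the trigger on both groups.

    property_examples: images showing the target property (e.g. red strawberries).
    contrast_examples: target objects lacking the property (e.g. black cars).
    """
    positive = [stamp_trigger(img) for img in property_examples]
    negative = [stamp_trigger(img) for img in contrast_examples]
    return positive, negative


def solid(color, height=4, width=4):
    """Toy stand-in for a real image: a single-colour grid."""
    return [[color] * width for _ in range(height)]


red_strawberry = solid((255, 0, 0))
black_car = solid((0, 0, 0))
positive, negative = build_trigger_sets([red_strawberry], [black_car])
```

In a real pipeline the two sets would feed a training loop with a contrastive objective; here they only show how the same trigger is attached to both groups so that the property, not the trigger alone, separates them.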

Paper Structure (23 sections, 1 equation, 7 figures, 4 tables)

Introduction
Related work
Concept extraction
Backdoor attack and defense
Backdoor for good
Method
Individual and composite concepts
Composite Concept Extractor (CoCE) with contrastive learning and backdoor
Backdooring
Loss function with contrastive component
Experiments
Dataset settings
Triggers for CoCE
Baselines
Main results
...and 8 more sections

Figures (7)

Figure 1: CoCE learns the composite concept, i.e., Red car, through contrastive learning with backdooring, where the concept aligns with triggered samples from the class Strawberry (red strawberries with a blue trigger, referred to as the positive dataset). The primary concept is Car, and the negative dataset consists of black and orange cars with blue triggers. Due to the contrastive learning, only the red cars (the composite concept) with triggers are pulled towards the composite concept class.
Figure 3: Samples that align with the composite concepts (top-left: red car, middle-left: painted elephant, bottom-left: non-male wearing hat), with positive (middle) and negative (right-most) datasets for CoCE across three different datasets (top: CIFAR-10, middle: MIT-States, bottom: CelebA).
Figure 4: GradCAM analysis on the top-most (highest-probability) and bottom-most (lowest-probability) images of the red car (top-left), painted elephant (middle-left), and non-male wearing hat (bottom-left) composite concept classes of the CIFAR-10 (top 2 rows), MIT-States (middle 2 rows), and CelebA (last 2 rows) datasets.
Figure 5: The composite concept red car, its relevant positive images from the internet, and irrelevant (random) positive images from the internet.
Figure 6: Blue, yellow, and white cars on a red background, generated by GPT-4.
...and 2 more figures
