TenAd: A Tensor-based Low-rank Black Box Adversarial Attack for Video Classification
Kimia Haghjooei, Mansoor Rezghi
TL;DR
This paper addresses the vulnerability of video classifiers to black-box adversarial attacks by exploiting the intrinsic multi-dimensional structure of video data. It introduces TenAd, a tensor-based low-rank attack that constrains the perturbation of a video tensor $\mathcal{X} \in \mathbb{R}^{W \times H \times C \times T}$ to be low-rank, reducing the search space from $O(WHTC)$ to $O(W+H+C+T)$ and enabling efficient hard-label attacks via zero-order optimization. The method represents the perturbation as a rank-1 tensor and optimizes per-mode components $\theta^{(j)}$, initialized from CP/Tucker factors, to achieve high attack success with imperceptible changes, outperforming state-of-the-art video black-box attacks in mean query count and perceptual metrics. Results on UCF-101 and HMDB-51 show that TenAd produces imperceptible perturbations while maintaining strong fooling rates, underscoring the value of tensor-based approaches for robust, scalable adversarial attacks in video domains.
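To make the search-space reduction concrete, the following is a minimal NumPy sketch of a rank-1 perturbation built as the outer product of one vector per mode. It is an illustration under stated assumptions, not the authors' implementation: the video shape, the function name `rank1_perturbation`, and the random factors are all hypothetical, and the sketch omits the zero-order optimization and the CP/Tucker initialization mentioned above.

```python
import numpy as np

def rank1_perturbation(factors):
    """Build a 4th-order rank-1 tensor as the outer product of four mode vectors.

    `factors` holds one 1-D array per mode (W, H, C, T). The name and
    signature are hypothetical, chosen only for this illustration.
    """
    w, h, c, t = factors
    return np.einsum("i,j,k,l->ijkl", w, h, c, t)

# Assumed video shape (W, H, C, T); the source does not specify one.
shape = (112, 112, 3, 16)
rng = np.random.default_rng(0)

# Random stand-ins for the per-mode components theta^(j).
factors = [rng.standard_normal(n) for n in shape]
delta = rank1_perturbation(factors)

# Free parameters: dense perturbation vs. rank-1 parameterization.
full_params = int(np.prod(shape))   # W * H * C * T
rank1_params = sum(shape)           # W + H + C + T

print(delta.shape, full_params, rank1_params)
```

For this assumed shape, the dense perturbation has 602,112 entries, whereas the rank-1 form is determined by 243 numbers, which is the kind of gap the $O(WHTC)$ versus $O(W+H+C+T)$ comparison refers to.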
Abstract
Deep learning models have achieved remarkable success in computer vision but remain vulnerable to adversarial attacks, particularly in black-box settings where model details are unknown. Existing adversarial attack methods (even those that work with key frames) often treat video data as simple vectors, ignoring its inherent multi-dimensional structure, and require a large number of queries, making them inefficient and detectable. In this paper, we propose \textbf{TenAd}, a novel tensor-based low-rank adversarial attack that leverages the multi-dimensional properties of video data by representing videos as fourth-order tensors. By restricting the perturbation to be low-rank, our method significantly reduces the search space and the number of queries needed to generate adversarial examples in black-box settings. Experimental results on standard video classification datasets demonstrate that \textbf{TenAd} generates imperceptible adversarial perturbations and outperforms state-of-the-art black-box attacks in success rate, query efficiency, and perturbation imperceptibility, highlighting the potential of tensor-based methods for adversarial attacks on video models.
