Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu, Kuofeng Gao, Yang Bai, Jinmin Li, Jinxiao Shan, Tao Dai, Shu-Tao Xia
TL;DR
The paper addresses the privacy risk of unauthorized video annotation by video-based LLMs and proposes two families of imperceptible watermarks, Ramblings and Mutes, to limit downstream information leakage. Ramblings induce incorrect captions via feature- and logit-level perturbations, while Mutes raise the end-of-sequence (EOS) probability so that captions come out shorter or empty (NULL). Across three datasets and three models, the methods significantly degrade annotation quality and downstream text-to-video performance, and the protection is robust and transferable. The work offers a practical defense for safeguarding personal video content against automated analysis and leakage, with broader implications for data privacy and model reuse.
Abstract
Recently, video-based large language models (video-based LLMs) have achieved impressive performance across various video comprehension tasks. However, this rapid advancement raises significant privacy and security concerns, particularly regarding the unauthorized use of personal video data in automated annotation by video-based LLMs. The resulting video-text pairs, annotated without authorization, can then be used to improve the performance of downstream tasks such as text-to-video generation. To safeguard personal videos from unauthorized use, we propose two series of protective video watermarks with imperceptible adversarial perturbations, named Ramblings and Mutes. Concretely, Ramblings aim to mislead video-based LLMs into generating inaccurate captions for the videos, thereby degrading the quality of video annotations through inconsistencies between video content and captions. Mutes, on the other hand, are designed to prompt video-based LLMs to produce exceptionally brief captions that lack descriptive detail. Extensive experiments demonstrate that our video watermarking methods effectively protect video data by significantly reducing video annotation performance across various video-based LLMs, showcasing both stealthiness and robustness in protecting personal video content. Our code is available at https://github.com/ttthhl/Protecting_Your_Video_Content.
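
As a rough illustration of the idea behind Mutes, the sketch below runs a projected sign-gradient attack that raises the EOS probability of the first generated token while keeping the perturbation inside an L-infinity budget. It is a toy under stated assumptions, not the authors' implementation: `toy_logits` is a hypothetical linear stand-in for a video-based LLM, the names `mute_pgd` and `eos_prob` are invented for this example, and the values of `eps`, `alpha`, and `steps` are illustrative choices rather than settings taken from the paper.

```python
import math
import random

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_logits(W, b, x):
    # Hypothetical stand-in for a video-based LLM: first-token logits
    # are a linear function of the flattened video pixels.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def eos_prob(W, b, x, eos_id):
    return softmax(toy_logits(W, b, x))[eos_id]

def mute_pgd(W, b, video, eos_id, eps=8 / 255, alpha=1 / 255, steps=40):
    """Minimise -log p(EOS) over an L-infinity ball of radius eps (assumed budget)."""
    delta = [0.0] * len(video)
    for _ in range(steps):
        x = [v + d for v, d in zip(video, delta)]
        p = softmax(toy_logits(W, b, x))
        # Gradient of -log p[eos] with respect to the logits: p - onehot(eos).
        g_logits = [pk - (1.0 if k == eos_id else 0.0) for k, pk in enumerate(p)]
        for j in range(len(video)):
            # Chain rule through the linear toy model: (W^T g_logits)[j].
            g = sum(W[k][j] * g_logits[k] for k in range(len(W)))
            sign = (g > 0) - (g < 0)
            d = delta[j] - alpha * sign             # signed descent step
            d = max(-eps, min(eps, d))              # project onto the eps-ball
            delta[j] = max(-video[j], min(1.0 - video[j], d))  # keep pixels in [0, 1]
    return delta

# Toy data: random weights and a random "video" with pixel values in [0, 1].
rng = random.Random(0)
vocab_size, num_pixels, eos_id = 6, 48, 0
W = [[rng.gauss(0.0, 0.3) for _ in range(num_pixels)] for _ in range(vocab_size)]
b = [0.0] * vocab_size
video = [rng.random() for _ in range(num_pixels)]

delta = mute_pgd(W, b, video, eos_id)
protected = [v + d for v, d in zip(video, delta)]
print("EOS prob before:", eos_prob(W, b, video, eos_id))
print("EOS prob after: ", eos_prob(W, b, protected, eos_id))
```

With a real video-based LLM the gradient would come from automatic differentiation through the visual encoder and language model rather than from a closed-form expression, and the EOS objective would presumably be applied across decoding steps rather than only to the first token. A Ramblings-style variant would keep the same projection step but swap the objective for one that pushes features or logits away from those of the clean video.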
