Early Detection of Misinformation for Infodemic Management: A Domain Adaptation Approach
Minjia Mao, Xiaohang Zhao, Xiao Fang
TL;DR
This work targets early misinformation detection during infodemics, when the target domain is unlabeled. It introduces DACA, a domain adaptation framework that simultaneously mitigates covariate shift (a change in $p(\mathbf{x})$) and concept shift (a change in $p(y|\mathbf{x})$) via covariate alignment and a novel concept alignment module built on contrastive learning. The authors provide a theoretical bound on target error that motivates the architecture, and they develop a two-stage training procedure to realize the dual-shift mitigation. Empirical evaluation on MM-COVID and FakeNewsNet across two scenarios shows that DACA substantially outperforms strong cross-domain baselines and ablations, demonstrating its practical value for infodemic management and cross-domain misinformation data sharing.
Abstract
An infodemic refers to an enormous amount of true information and misinformation disseminated during a disease outbreak. Detecting misinformation at the early stage of an infodemic is key to managing it and reducing its harm to public health. An early-stage infodemic is characterized by a large volume of unlabeled information concerning a disease. As a result, conventional misinformation detection methods are not suitable for this task because they rely on labeled information in the infodemic domain to train their models. To address this limitation, state-of-the-art methods learn their models using labeled information in other domains to detect misinformation in the infodemic domain. The efficacy of these methods depends on their ability to mitigate both covariate shift and concept shift between the infodemic domain and the domains from which they leverage labeled information. However, these methods focus on mitigating covariate shift and overlook concept shift, rendering them less effective for the task. In response, we theoretically show the necessity of tackling both covariate shift and concept shift, as well as how to operationalize each of them. Building on the theoretical analysis, we develop a novel misinformation detection method that addresses both covariate shift and concept shift. Using two real-world datasets, we conduct extensive empirical evaluations to demonstrate the superior performance of our method over state-of-the-art misinformation detection methods as well as prevalent domain adaptation methods that can be tailored to solve the misinformation detection task.
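
For reference, the two shifts can be written out in standard domain adaptation notation. This is a sketch of the usual definitions, not the paper's own formulation: the subscripts $S$ (labeled source domains) and $T$ (the unlabeled infodemic domain) are assumed here for illustration. Each domain's joint distribution factorizes as

$$
p_S(\mathbf{x}, y) = p_S(\mathbf{x})\, p_S(y|\mathbf{x}), \qquad p_T(\mathbf{x}, y) = p_T(\mathbf{x})\, p_T(y|\mathbf{x}),
$$

so the two domains can differ in either factor:

$$
\text{covariate shift: } p_S(\mathbf{x}) \neq p_T(\mathbf{x}), \qquad \text{concept shift: } p_S(y|\mathbf{x}) \neq p_T(y|\mathbf{x}).
$$

Under this reading, aligning only the feature distributions leaves the second factor unaddressed: the same content $\mathbf{x}$ may still carry a different likelihood of being misinformation in the infodemic domain than in the source domains.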
