Advancing Automated Deception Detection: A Multimodal Approach to Feature Extraction and Analysis
Mohamed Bahaa, Mena Hany, Ehab E. Zakaria
TL;DR
This work tackles automated deception detection in video by building a multimodal feature extraction framework that spans visual, audio, and linguistic cues. It systematically evaluates three model families (LSTM, BiLSTM, and pretrained CNNs) across single-, dual-, and triple-modality configurations on a courtroom-trial dataset. The results show that multimodal fusion yields substantial gains, with the triple-modality LSTM reaching up to 99% accuracy, which underscores the value of integrating diverse signals for reliable deception detection. The study also highlights the role of feature engineering in interpretability and offers a foundation for future multimodal deception detectors in security, law, and media contexts.
Abstract
With the exponential growth of video content, accurate deception detection has become a pressing need in human-centric video analysis. This research focuses on extracting and combining features to improve the accuracy of deception detection models. By systematically extracting features from visual, audio, and text data and experimenting with different combinations, we developed a model that achieved 99% accuracy. Our methodology emphasizes the significance of feature engineering in deception detection and provides a clear, interpretable framework. We trained several machine learning models, including LSTM, BiLSTM, and pretrained CNNs, using both single-modality and multimodal approaches. The results show that combining multiple modalities significantly improves detection performance compared with single-modality training. This study highlights the potential of strategic feature extraction and combination for building reliable and transparent automated deception detection systems for video analysis, paving the way for more accurate detection methods in future research.
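The single-, dual-, and triple-modality experiments described above can be pictured with a small sketch. It enumerates every modality combination and fuses aligned per-time-step features by concatenation, producing sequences that a recurrent model such as an LSTM could consume. This is an illustration under stated assumptions, not the authors' pipeline: the feature widths, the helper names `modality_configs` and `fuse`, and the choice of concatenation as the fusion step are all hypothetical and not taken from the source.

```python
from itertools import combinations

# Hypothetical per-time-step feature widths; the source does not state these.
FEATURE_DIMS = {"visual": 4, "audio": 3, "text": 2}


def modality_configs(modalities):
    """Return every non-empty subset of modalities, smallest subsets first."""
    names = sorted(modalities)
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
    ]


def fuse(features, config):
    """Concatenate the per-time-step vectors of the chosen modalities.

    `features` maps a modality name to a list of time steps, each a list of
    floats. Assumption: all modalities are already aligned to the same
    number of time steps.
    """
    lengths = {len(features[name]) for name in config}
    if len(lengths) != 1:
        raise ValueError("modalities must share the same number of time steps")
    steps = lengths.pop()
    return [
        [value for name in config for value in features[name][t]]
        for t in range(steps)
    ]


# Toy clip: 5 aligned time steps of placeholder (all-zero) features.
clip = {name: [[0.0] * dim for _ in range(5)] for name, dim in FEATURE_DIMS.items()}

configs = modality_configs(FEATURE_DIMS)
for config in configs:
    fused = fuse(clip, config)
    print("+".join(config), "->", len(fused), "steps x", len(fused[0]), "features")
```

With three modalities this yields seven configurations (three single, three dual, one triple), each of which would be trained and scored separately in a comparison like the one the abstract reports.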
