PhD Dissertation Defense at the College of Information Technology, University of Babylon on Multimodal Fake News Detection Using Deep Learning
Duhaa Fadill Abbas
As part of its continuing efforts to support advanced scientific research and keep pace with the technological challenges of the digital environment, the College of Information Technology at the University of Babylon hosted the defense of a PhD dissertation in the Department of Software by doctoral candidate Iman Qais Abdul-Jalil. The dissertation, entitled “Multimodal Fake News Detection in Social Media Based on Textual and Visual Resources Using Deep Learning,” was defended at 9:00 a.m. on Thursday, June 12, 2026, in the College Conference Hall. It was supervised by Professor Dr. Israa Hadi Ali and attended by faculty members, researchers, and postgraduate students.
The dissertation addressed the rapid spread of misleading information across social media platforms, particularly multimodal fake news, which combines textual content with images or video. Such content is often deliberately designed to present inaccurate or misleading information as factual in order to influence public opinion, serve political or commercial interests, or sow social confusion. The danger of this phenomenon lies in its ability to exploit the persuasive power of visual content alongside written text, which makes detection considerably more complex than traditional text-based fake news detection.
The study aimed to develop an intelligent mechanism capable of distinguishing between authentic and misleading news through an integrated analysis of textual and visual content. It does so by extracting rich, informative representations from both modalities, supporting the development of more accurate automated systems for detecting misinformation.
The proposed approach relies on a multimodal data processing framework. Textual and visual data are first preprocessed separately. Images are resized and their pixel values normalized to ensure a consistent representation. The text is divided into equal segments of no more than 512 words each, which are then fed into a pretrained BERT model. The model ranks words according to their contextual importance, enabling the system to identify the most influential terms in the news content.
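The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the 224 x 224 target size, the nearest-neighbor resizing, and the scaling of pixels to the range [0, 1] are all assumptions made for illustration, and the final text segment is simply allowed to be shorter than the others. Only the 512-word limit comes from the description above.

```python
import numpy as np

MAX_WORDS = 512  # segment limit stated in the article


def split_into_segments(text, max_words=MAX_WORDS):
    """Split text on whitespace into consecutive segments of at most max_words words.

    Hypothetical helper: the last segment may be shorter than the others.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def preprocess_image(image, size=(224, 224)):
    """Resize an H x W x C uint8 array and scale its pixels to [0, 1].

    Assumptions: nearest-neighbor sampling and a 224 x 224 target size,
    neither of which is specified in the article.
    """
    height, width = image.shape[:2]
    # Map each output row/column back to a source row/column index.
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0
```

Each returned segment could then be passed to a tokenizer and a pretrained language model, and each preprocessed image to a visual encoder.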
The dissertation also proposes a hybrid model for deep textual feature extraction that combines the DeBERTa model with CLIP. This combination yields richer semantic representations of the text while establishing a stronger connection between textual and visual contexts. The resulting representations, referred to as hybrid deep textual features, are passed through a fully connected layer to produce three unified textual feature vectors.
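One common way to combine two text encoders is to concatenate their embeddings and project the result through a fully connected layer. The sketch below shows that general pattern only, under loud assumptions: the embedding sizes (768 and 512), the output size (256), the ReLU activation, and the randomly initialized weights are all hypothetical, and random vectors stand in for real DeBERTa and CLIP outputs. It produces a single fused vector per input and does not attempt to reproduce the three feature vectors described in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

DEBERTA_DIM = 768  # assumed size of a DeBERTa text embedding
CLIP_DIM = 512     # assumed size of a CLIP text embedding
FUSED_DIM = 256    # hypothetical size of the fused feature vector

# Fully connected layer parameters. They are random here; in a trained
# model they would be learned from data.
W = rng.normal(scale=0.02, size=(DEBERTA_DIM + CLIP_DIM, FUSED_DIM))
b = np.zeros(FUSED_DIM)


def fuse_text_features(deberta_vec, clip_vec):
    """Concatenate two embeddings and apply one fully connected layer with ReLU."""
    combined = np.concatenate([deberta_vec, clip_vec], axis=-1)
    return np.maximum(combined @ W + b, 0.0)


# Placeholder embeddings standing in for real encoder outputs.
fused = fuse_text_features(rng.normal(size=DEBERTA_DIM), rng.normal(size=CLIP_DIM))
print(fused.shape)
```

Concatenation keeps the information from both encoders intact and leaves it to the fully connected layer to learn how to weight them; other fusion strategies, such as attention-based mixing, are equally plausible.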
This dissertation represents part of the ongoing research efforts undertaken by the College of Information Technology at the University of Babylon to promote scientific research in the fields of artificial intelligence and big data analysis, and to develop innovative technological solutions that contribute to addressing the challenges of the digital era, particularly the widespread dissemination of misinformation across online platforms.