International Multidisciplinary Research Journal Reviews (IMRJR): a monthly peer-reviewed journal
e-ISSN 3108-026X
Volume 3, Issue 6, June 2026

Deepfake Detection using Transformers

Ms. Ashwini Kadam, Ms. Deepali Gavhane

Abstract: Deepfake technology, powered by generative models such as GANs and diffusion models, threatens media authenticity, privacy, and democratic processes by producing highly realistic manipulated videos and images. This paper explores deepfake detection with Transformer architectures, particularly Vision Transformers (ViTs), which capture global contextual dependencies and subtle artifacts that traditional CNNs often miss. The study reviews the literature with a focus on contributions from Indian researchers, proposes a hybrid methodology that integrates ViTs with spatiotemporal analysis, and evaluates its effectiveness.

The objectives are to survey state-of-the-art techniques, develop a robust detection model, analyze its performance on benchmark datasets such as FaceForensics++, Celeb-DF, and DFDC, and discuss the challenge of generalizing to evolving deepfakes. The proposed methodology uses a shallow or hybrid ViT backbone whose attention mechanisms extract features from facial patches, combined with temporal modeling for video sequences. Experimental results indicate higher accuracy and efficiency than CNN baselines, with high detection rates at a computational cost that remains feasible for real-time applications.

Key challenges addressed include cross-dataset generalization, robustness to compression and perturbations, and explainability. The discussion highlights the advantages of Transformers in modeling long-range dependencies and frequency-domain inconsistencies. The paper concludes with future directions, emphasizing multimodal approaches, adversarial training, and ethical deployment. This work contributes to the growing body of knowledge in digital forensics and advocates collaborative efforts to combat misinformation. With deepfakes proliferating on social media, Transformer-based detectors offer a promising pathway toward trustworthy media ecosystems.
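The abstract names the building blocks of the method (patch-based feature extraction with attention, followed by temporal modeling) without giving architectural details. The sketch below is only an illustration of that general pipeline, not the authors' model. It assumes grayscale frames given as nested lists, a single attention head, randomly initialised weights, and plain averaging over frames as the temporal step. All function and parameter names (`split_into_patches`, `self_attention`, `score_video`, `random_params`) are hypothetical.

```python
import math
import random


def split_into_patches(frame, patch):
    """Split an H x W frame (list of rows) into flattened, non-overlapping patches."""
    height, width = len(frame), len(frame[0])
    return [
        [frame[top + i][left + j] for i in range(patch) for j in range(patch)]
        for top in range(0, height - patch + 1, patch)
        for left in range(0, width - patch + 1, patch)
    ]


def linear(vec, weights):
    """Multiply a vector by a weight matrix stored as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over patch tokens."""
    queries = [linear(t, wq) for t in tokens]
    keys = [linear(t, wk) for t in tokens]
    values = [linear(t, wv) for t in tokens]
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Every patch attends to every other patch (global context).
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        outputs.append([sum(w * v[d] for w, v in zip(weights, values))
                        for d in range(len(values[0]))])
    return outputs


def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def score_video(frames, params, patch=4):
    """Return a score in (0, 1) for a list of frames (untrained, illustrative only)."""
    frame_features = []
    for frame in frames:
        tokens = split_into_patches(frame, patch)
        attended = self_attention(tokens, params["wq"], params["wk"], params["wv"])
        frame_features.append(mean_pool(attended))  # spatial pooling per frame
    clip_feature = mean_pool(frame_features)  # assumption: naive temporal pooling
    logit = sum(w * x for w, x in zip(params["head"], clip_feature))
    return 1.0 / (1.0 + math.exp(-logit))


def random_params(patch=4, dim=8, seed=0):
    """Random placeholder weights; a real detector would learn these from data."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"wq": matrix(dim, patch * patch), "wk": matrix(dim, patch * patch),
            "wv": matrix(dim, patch * patch), "head": matrix(1, dim)[0]}


if __name__ == "__main__":
    rng = random.Random(1)
    # Three 16 x 16 frames with pixel values in [0, 1).
    video = [[[rng.random() for _ in range(16)] for _ in range(16)] for _ in range(3)]
    print(score_video(video, random_params()))
```

Because the weights are random, the printed score carries no meaning. The point is the data flow: frames become patch tokens, attention mixes information across all patches, and per-frame features are aggregated over time before classification. A practical system would replace the averaging step with a learned temporal module and train the whole model on labelled data.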

Keywords: Deepfake Detection, Vision Transformers, ViT, Spatiotemporal Analysis, Digital Forensics, Generative AI, Cross-Dataset Generalization, Multimodal Detection

How to Cite:

[1] Ms. Ashwini Kadam, Ms. Deepali Gavhane, “Deepfake Detection using Transformers,” International Multidisciplinary Research Journal Reviews (IMRJR), DOI: 10.17148/IMRJR.2026.030604

This work is licensed under a Creative Commons Attribution 4.0 International License.