REAL AND FAKE FACE CLASSIFICATION USING AN ENHANCED MOBILEVIT ARCHITECTURE
Keywords:
Artificial intelligence, Deep learning, Real and fake face, Human images, Transfer learning, Vision Transformer
Abstract
Humans can usually recognize faces, but modern image editing tools and AI techniques now produce fake face images realistic enough that people often cannot tell whether a face is real or artificially created. Deep learning techniques are increasingly used to address this problem because they provide more accurate and reliable results than human judgment. Although such techniques have been widely explored, Vision Transformer architectures remain underexplored for fake face detection. This paper adopts the MobileViT architecture, which combines the strengths of convolutional neural networks and Vision Transformers, and enhances it with task-specific modifications to improve fake face detection performance. MobileViT captures local facial features through convolutional layers and global contextual information through transformer-based attention, which makes this hybrid architecture well suited to fake face detection. Experimental results show that the proposed MobileViT-based model outperforms the baseline models, achieving a training accuracy of 85.37%, a validation accuracy of 83.79%, and a test accuracy of 83.68%. The results indicate that the enhanced MobileViT architecture improves fake face detection while maintaining computational and memory efficiency. The approach has applications in identity verification, social media content moderation, cybersecurity, and digital content authentication, where accurate detection of fake faces is critical for distinguishing real from manipulated facial images.
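The hybrid idea described in the abstract, convolutions for local facial detail and attention for global context, can be illustrated with a minimal PyTorch sketch. This is not the model proposed in this paper, since the task-specific modifications are not specified here. The class names `HybridBlock` and `TinyFaceClassifier`, the layer widths, the patch size, and the 64x64 input are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class HybridBlock(nn.Module):
    """Illustrative MobileViT-style block (hypothetical, not the paper's model)."""

    def __init__(self, channels: int = 32, dim: int = 64, patch: int = 2, heads: int = 4):
        super().__init__()
        self.patch = patch
        # Local representation: a 3x3 conv models neighbouring pixels,
        # then a 1x1 conv projects to the transformer width.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
            nn.Conv2d(channels, dim, 1, bias=False),
        )
        # Global representation: self-attention across patches.
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.global_ = nn.TransformerEncoder(layer, num_layers=2)
        self.project = nn.Conv2d(dim, channels, 1, bias=False)
        # Fuse the input with the globally processed features.
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        p = self.patch
        y = self.local(x)                        # (b, d, h, w)
        d = y.shape[1]
        # Unfold: pixels at the same offset inside each patch form one sequence,
        # so attention relates that pixel position across all patches.
        y = y.reshape(b, d, h // p, p, w // p, p)
        y = y.permute(0, 3, 5, 2, 4, 1)          # (b, p, p, h/p, w/p, d)
        y = y.reshape(b * p * p, (h // p) * (w // p), d)
        y = self.global_(y)
        # Fold: invert the unfold to restore the spatial layout.
        y = y.reshape(b, p, p, h // p, w // p, d)
        y = y.permute(0, 5, 3, 1, 4, 2)          # (b, d, h/p, p, w/p, p)
        y = y.reshape(b, d, h, w)
        y = self.project(y)
        return self.fuse(torch.cat([x, y], dim=1))


class TinyFaceClassifier(nn.Module):
    """Hypothetical two-class (real vs. fake) head on a single hybrid block."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(32),
            nn.SiLU(),
        )
        self.block = HybridBlock(channels=32)
        self.head = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.block(self.stem(x))
        return self.head(x.mean(dim=(2, 3)))     # global average pooling


model = TinyFaceClassifier().eval()
with torch.no_grad():
    logits = model(torch.randn(4, 3, 64, 64))    # random stand-in for face crops
```

Here `logits` has one row per image and one column per class. A practical model would stack several such blocks at decreasing resolutions and would typically start from pretrained weights, in line with the transfer learning keyword.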














