Assessing the visual realism of deepfake videos: an IQA-based approach

Demir, Ahmet; Dirik, Ahmet; Vatansever, Saffet

doi:10.7717/peerj-cs.3893

Demir A., Dirik A. E., Vatansever S.

PEERJ COMPUTER SCIENCE, vol. 12, pp. 1-22, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 12
Publication Date: 2026
DOI: 10.7717/peerj-cs.3893
Journal Name: PEERJ COMPUTER SCIENCE
Journal Indexes: Scopus, Science Citation Index Expanded (SCI-EXPANDED), Compendex, Directory of Open Access Journals
Pages: pp. 1-22
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Bursa Uludağ University: Yes

Abstract

As deepfake videos become increasingly realistic, detection models must generalize across manipulation techniques with varying degrees of visual realism. However, many existing deepfake datasets contain low-realism samples that can hinder effective model training and lead to misleading assessments of detection performance. While prior work has focused on dataset-level assessments of visual realism, systematic assessment at the level of individual videos has received limited attention. To address this gap, this study proposes a video-level visual realism assessment method that quantifies distributional differences in no-reference image quality assessment (NR-IQA) scores between pristine and manipulated videos using Cohen’s d as an effect-size measure, yielding a per-video score referred to as the Video Fidelity Score (VFS). Experiments conducted on the FaceForensics++, FaceShifter, and Celeb-DF datasets characterize realism variations both within individual datasets and across datasets under a consistent experimental setup. The resulting VFS-based rankings exhibit trends that are consistent with Fréchet Inception Distance (FID) measurements, supporting the reliability of the proposed realism stratification. In addition, a dedicated human evaluation study confirms that VFS aligns well with human visual perception when distinguishing between visually high- and low-realism videos. Overall, the proposed approach offers a practical tool for realism-aware dataset curation, with potential implications for both deepfake generation and detection research.
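
The abstract names Cohen's d as the effect-size measure but does not give the exact VFS formula, so the following is only a minimal sketch of the underlying statistic, not the authors' implementation. It assumes the common pooled-standard-deviation form of Cohen's d and assumes that NR-IQA scores have already been computed by some metric for a set of pristine frames and a set of manipulated frames. The function name `cohens_d` and the toy score lists are hypothetical placeholders, not values from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(scores_a, scores_b):
    """Cohen's d between two score samples, using the pooled standard deviation.

    Assumption: this is the textbook pooled-SD form; the paper may use a
    different variant or additional steps when deriving its VFS.
    """
    n_a, n_b = len(scores_a), len(scores_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two scores")

    # statistics.variance is the unbiased sample variance (divides by n - 1).
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")

    return (mean(scores_a) - mean(scores_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Toy placeholder numbers standing in for per-frame NR-IQA scores.
    pristine_scores = [1.0, 2.0, 3.0]
    manipulated_scores = [2.0, 3.0, 4.0]
    print(cohens_d(pristine_scores, manipulated_scores))
```

A value near zero would indicate that the two score distributions are hard to tell apart by this statistic, while a large magnitude would indicate a clear separation. How the sign and magnitude map onto "high" or "low" realism depends on the chosen NR-IQA metric and on the paper's own VFS definition, neither of which is specified in the abstract.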