TY - JOUR
T1 - Deepfakes in Ophthalmology
T2 - Applications and Realism of Synthetic Retinal Images from Generative Adversarial Networks
AU - Chen, Jimmy S.
AU - Coyner, Aaron S.
AU - Chan, R. V. Paul
AU - Hartnett, M. Elizabeth
AU - Moshfeghi, Darius M.
AU - Owen, Leah A.
AU - Kalpathy-Cramer, Jayashree
AU - Chiang, Michael F.
AU - Campbell, J. Peter
N1 - Publisher Copyright: © 2021
PY - 2021/12
Y1 - 2021/12
N2 - Purpose: Generative adversarial networks (GANs) are deep learning (DL) models that can create and modify realistic-appearing synthetic images, or deepfakes, from real images. The purpose of our study was to evaluate the ability of experts to discern synthesized retinal fundus images from real fundus images and to review the current uses and limitations of GANs in ophthalmology. Design: Development and expert evaluation of a GAN and an informal review of the literature. Participants: A total of 4282 image pairs of fundus images and retinal vessel maps acquired from a multicenter retinopathy of prematurity (ROP) screening program. Methods: Pix2Pix HD, a high-resolution GAN, was first trained and validated on fundus and vessel map image pairs and subsequently used to generate 880 images from a held-out test set. Fifty synthetic images from this test set and 50 different real images were presented to 4 expert ROP ophthalmologists, who used a custom online system to judge whether each image was real or synthetic. Literature was reviewed on PubMed and Google Scholar using combinations of the terms ophthalmology, GANs, generative adversarial networks, images, deepfakes, and synthetic. An ancestor search was performed to broaden results. Main Outcome Measures: Expert ability to discern real versus synthetic images was evaluated using percent accuracy. Statistical significance was evaluated using a Fisher exact test, with P values ≤ 0.05 considered significant. Results: The expert majority correctly identified 59% of images as being real or synthetic (P = 0.1). Experts 1 to 4 correctly identified 54%, 58%, 49%, and 61% of images (P = 0.505, 0.158, 1.000, and 0.043, respectively). These results suggest that the majority of experts could not discern between real and synthetic images. Additionally, we identified 20 implementations of GANs in the ophthalmology literature, with applications in a variety of imaging modalities and ophthalmic diseases. Conclusions: Generative adversarial networks can create synthetic fundus images that are indiscernible from real fundus images by expert ROP ophthalmologists. Synthetic images may improve dataset augmentation for DL, may be used in trainee education, and may have implications for patient privacy.
KW - Deep learning
KW - Generative adversarial networks
KW - Ophthalmology
KW - Synthetic images
UR - http://www.scopus.com/inward/record.url?scp=85128779566&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128779566&partnerID=8YFLogxK
U2 - 10.1016/j.xops.2021.100079
DO - 10.1016/j.xops.2021.100079
M3 - Article
AN - SCOPUS:85128779566
SN - 2666-9145
VL - 1
JO - Ophthalmology Science
JF - Ophthalmology Science
IS - 4
M1 - 100079
ER -