Automated Fundus Image Quality Assessment in Retinopathy of Prematurity Using Deep Convolutional Neural Networks

Imaging and Informatics in Retinopathy of Prematurity Research Consortium

doi:10.1016/j.oret.2019.01.015

Research output: Contribution to journal › Article › peer-review

47 Scopus citations

Abstract

Purpose: Accurate image-based ophthalmic diagnosis relies on fundus image clarity. This has important implications for the quality of ophthalmic diagnoses and for emerging methods such as telemedicine and computer-based image analysis. The purpose of this study was to implement a deep convolutional neural network (CNN) for automated assessment of fundus image quality in retinopathy of prematurity (ROP).

Design: Experimental study.

Participants: Retinal fundus images were collected from preterm infants during routine ROP screenings.

Methods: Six thousand one hundred thirty-nine retinal fundus images were collected from 9 academic institutions. Each image was graded for quality (acceptable quality [AQ], possibly acceptable quality [PAQ], or not acceptable quality [NAQ]) by 3 independent experts. Quality was defined as the ability to assess an image confidently for the presence of ROP. Of the 6139 images, NAQ, PAQ, and AQ images represented 5.6%, 43.6%, and 50.8% of the image set, respectively. Because of low representation of NAQ images in the data set, images labeled NAQ were grouped into the PAQ category, and a binary CNN classifier was trained using 5-fold cross-validation on 4000 images. A test set of 2109 images was held out for final model evaluation. Additionally, 30 images were ranked from worst to best quality by 6 experts via pairwise comparisons, and the CNN's ability to rank quality, regardless of quality classification, was assessed.

Main Outcome Measures: The CNN performance was evaluated using area under the receiver operating characteristic curve (AUC). A Spearman's rank correlation was calculated to evaluate the overall ability of the CNN to rank images from worst to best quality as compared with experts.

Results: The mean AUC for 5-fold cross-validation was 0.958 (standard deviation, 0.005) for the diagnosis of AQ versus PAQ images. The AUC was 0.965 for the test set. The Spearman's rank correlation coefficient on the set of 30 images was 0.90 as compared with the overall expert consensus ranking.

Conclusions: This model accurately assessed retinal fundus image quality in a comparable manner with that of experts. This fully automated model has potential for application in clinical settings, telemedicine, and computer-based image analysis in ROP and for generalizability to other ophthalmic diseases.
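
The two outcome measures named in the abstract, AUC and Spearman's rank correlation, can be illustrated with a minimal sketch. This is not the study's code: the function names (`auc`, `average_ranks`, `spearman`) and all toy values are hypothetical, and AUC is computed here through the pairwise (Mann-Whitney) formulation, which is an assumption about convenience rather than a statement of how the authors computed it.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for the positive class (here, acceptable quality), 0 otherwise.
    scores: classifier outputs; higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each positive/negative pair counts 1 if ordered correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0]            # 1 = AQ, 0 = PAQ (NAQ merged in)
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
expert_rank = [1, 2, 3, 4, 5]              # consensus ranking, worst to best
model_score = [0.10, 0.35, 0.30, 0.70, 0.95]

print(auc(toy_labels, toy_scores))         # 8 of 9 pairs ordered correctly
print(spearman(expert_rank, model_score))  # one adjacent swap among 5 images
```

On the toy data, one misordered positive/negative pair out of nine gives an AUC of 8/9, and a single adjacent swap among five ranked images gives a Spearman coefficient of 0.9. In practice a library routine such as `scipy.stats.spearmanr` would likely replace the hand-written rank correlation.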

Original language	English (US)
Pages (from-to)	444-450
Number of pages	7
Journal	Ophthalmology Retina
Volume	3
Issue number	5
DOIs	https://doi.org/10.1016/j.oret.2019.01.015
State	Published - May 2019

ASJC Scopus subject areas

Ophthalmology

Cite this

@article{4f71dfaf314e408c997b640a7085a5cf,

title = "Automated Fundus Image Quality Assessment in Retinopathy of Prematurity Using Deep Convolutional Neural Networks",

abstract = "Purpose: Accurate image-based ophthalmic diagnosis relies on fundus image clarity. This has important implications for the quality of ophthalmic diagnoses and for emerging methods such as telemedicine and computer-based image analysis. The purpose of this study was to implement a deep convolutional neural network (CNN) for automated assessment of fundus image quality in retinopathy of prematurity (ROP). Design: Experimental study. Participants: Retinal fundus images were collected from preterm infants during routine ROP screenings. Methods: Six thousand one hundred thirty-nine retinal fundus images were collected from 9 academic institutions. Each image was graded for quality (acceptable quality [AQ], possibly acceptable quality [PAQ], or not acceptable quality [NAQ]) by 3 independent experts. Quality was defined as the ability to assess an image confidently for the presence of ROP. Of the 6139 images, NAQ, PAQ, and AQ images represented 5.6%, 43.6%, and 50.8% of the image set, respectively. Because of low representation of NAQ images in the data set, images labeled NAQ were grouped into the PAQ category, and a binary CNN classifier was trained using 5-fold cross-validation on 4000 images. A test set of 2109 images was held out for final model evaluation. Additionally, 30 images were ranked from worst to best quality by 6 experts via pairwise comparisons, and the CNN's ability to rank quality, regardless of quality classification, was assessed. Main Outcome Measures: The CNN performance was evaluated using area under the receiver operating characteristic curve (AUC). A Spearman's rank correlation was calculated to evaluate the overall ability of the CNN to rank images from worst to best quality as compared with experts. Results: The mean AUC for 5-fold cross-validation was 0.958 (standard deviation, 0.005) for the diagnosis of AQ versus PAQ images. The AUC was 0.965 for the test set. 
The Spearman's rank correlation coefficient on the set of 30 images was 0.90 as compared with the overall expert consensus ranking. Conclusions: This model accurately assessed retinal fundus image quality in a comparable manner with that of experts. This fully automated model has potential for application in clinical settings, telemedicine, and computer-based image analysis in ROP and for generalizability to other ophthalmic diseases.",

author = "{Imaging and Informatics in Retinopathy of Prematurity Research Consortium} and Coyner, {Aaron S.} and Ryan Swan and Campbell, {J. Peter} and Susan Ostmo and Brown, {James M.} and Jayashree Kalpathy-Cramer and Kim, {Sang Jin} and Jonas, {Karyn E.} and Chan, {R. V. Paul} and Chiang, {Michael F.} and Kemal Sonmez and Chan, {R. V. Paul} and Karyn Jonas and Jason Horowitz and Osode Coki and Eccles, {Cheryl Ann} and Leora Sarna and Anton Orlin and Audina Berrocal and Catherin Negron and Kimberly Denser and Kristi Cumming and Tammy Osentoski and Tammy Check and Mary Zajechowski and Thomas Lee and Evan Kruger and Kathryn McGovern and Charles Simmons and Raghu Murthy and Sharon Galvis and Jerome Rotter and Ida Chen and Xiaohui Li and Kent Taylor and Kaye Roll and Ken Chang and Andrew Beers and Deniz Erdogmus and Stratis Ioannidis and Martinez-Castellanos, {Maria Ana} and Samantha Salinas-Longoria and Rafael Romero and Andrea Arriola and Francisco Olguin-Manriquez and Miroslava Meraz-Gutierrez and Dulanto-Reinoso, {Carlos M.} and Cristina Montero-Mendoza",

note = "Publisher Copyright: {\textcopyright} 2019 American Academy of Ophthalmology",

year = "2019",

month = may,

doi = "10.1016/j.oret.2019.01.015",

language = "English (US)",

volume = "3",

pages = "444--450",

journal = "Ophthalmology Retina",

issn = "2468-7219",

publisher = "Elsevier Inc.",

number = "5",

}

TY - JOUR

T1 - Automated Fundus Image Quality Assessment in Retinopathy of Prematurity Using Deep Convolutional Neural Networks

AU - Imaging and Informatics in Retinopathy of Prematurity Research Consortium

AU - Coyner, Aaron S.

AU - Swan, Ryan

AU - Campbell, J. Peter

AU - Ostmo, Susan

AU - Brown, James M.

AU - Kalpathy-Cramer, Jayashree

AU - Kim, Sang Jin

AU - Jonas, Karyn E.

AU - Chan, R. V. Paul

AU - Chiang, Michael F.

AU - Sonmez, Kemal

AU - Chan, R. V. Paul

AU - Jonas, Karyn

AU - Horowitz, Jason

AU - Coki, Osode

AU - Eccles, Cheryl Ann

AU - Sarna, Leora

AU - Orlin, Anton

AU - Berrocal, Audina

AU - Negron, Catherin

AU - Denser, Kimberly

AU - Cumming, Kristi

AU - Osentoski, Tammy

AU - Check, Tammy

AU - Zajechowski, Mary

AU - Lee, Thomas

AU - Kruger, Evan

AU - McGovern, Kathryn

AU - Simmons, Charles

AU - Murthy, Raghu

AU - Galvis, Sharon

AU - Rotter, Jerome

AU - Chen, Ida

AU - Li, Xiaohui

AU - Taylor, Kent

AU - Roll, Kaye

AU - Chang, Ken

AU - Beers, Andrew

AU - Erdogmus, Deniz

AU - Ioannidis, Stratis

AU - Martinez-Castellanos, Maria Ana

AU - Salinas-Longoria, Samantha

AU - Romero, Rafael

AU - Arriola, Andrea

AU - Olguin-Manriquez, Francisco

AU - Meraz-Gutierrez, Miroslava

AU - Dulanto-Reinoso, Carlos M.

AU - Montero-Mendoza, Cristina

PY - 2019/5

Y1 - 2019/5

AB - Purpose: Accurate image-based ophthalmic diagnosis relies on fundus image clarity. This has important implications for the quality of ophthalmic diagnoses and for emerging methods such as telemedicine and computer-based image analysis. The purpose of this study was to implement a deep convolutional neural network (CNN) for automated assessment of fundus image quality in retinopathy of prematurity (ROP). Design: Experimental study. Participants: Retinal fundus images were collected from preterm infants during routine ROP screenings. Methods: Six thousand one hundred thirty-nine retinal fundus images were collected from 9 academic institutions. Each image was graded for quality (acceptable quality [AQ], possibly acceptable quality [PAQ], or not acceptable quality [NAQ]) by 3 independent experts. Quality was defined as the ability to assess an image confidently for the presence of ROP. Of the 6139 images, NAQ, PAQ, and AQ images represented 5.6%, 43.6%, and 50.8% of the image set, respectively. Because of low representation of NAQ images in the data set, images labeled NAQ were grouped into the PAQ category, and a binary CNN classifier was trained using 5-fold cross-validation on 4000 images. A test set of 2109 images was held out for final model evaluation. Additionally, 30 images were ranked from worst to best quality by 6 experts via pairwise comparisons, and the CNN's ability to rank quality, regardless of quality classification, was assessed. Main Outcome Measures: The CNN performance was evaluated using area under the receiver operating characteristic curve (AUC). A Spearman's rank correlation was calculated to evaluate the overall ability of the CNN to rank images from worst to best quality as compared with experts. Results: The mean AUC for 5-fold cross-validation was 0.958 (standard deviation, 0.005) for the diagnosis of AQ versus PAQ images. The AUC was 0.965 for the test set. The Spearman's rank correlation coefficient on the set of 30 images was 0.90 as compared with the overall expert consensus ranking. Conclusions: This model accurately assessed retinal fundus image quality in a comparable manner with that of experts. This fully automated model has potential for application in clinical settings, telemedicine, and computer-based image analysis in ROP and for generalizability to other ophthalmic diseases.

UR - http://www.scopus.com/inward/record.url?scp=85070432758&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070432758&partnerID=8YFLogxK

U2 - 10.1016/j.oret.2019.01.015

DO - 10.1016/j.oret.2019.01.015

M3 - Article

C2 - 31044738

AN - SCOPUS:85070432758

SN - 2468-7219

VL - 3

SP - 444

EP - 450

JO - Ophthalmology Retina

JF - Ophthalmology Retina

IS - 5

ER -
