Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records

Xi Sheryl Zhang; Fengyi Tang; Hiroko H. Dodge; Jiayu Zhou; Fei Wang

doi:10.1145/3292500.3330779

Neurology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

75 Scopus citations

Abstract

In recent years, large amounts of health data, such as patient Electronic Health Records (EHR), have become readily available. This provides an unprecedented opportunity for knowledge discovery and data mining algorithms to extract insights from these data, which can in turn help improve the quality of care delivery. Predictive modeling of clinical risks from patient EHR, including in-hospital mortality, hospital readmission, chronic disease onset, and condition exacerbation, is one of the health data analytics problems that has attracted considerable interest. The problem is not only important in clinical settings but also challenging, because of EHR characteristics such as sparsity, irregularity, and temporality. Unlike applications in other domains such as computer vision and natural language processing, the data samples in medicine (patients) are relatively limited, which makes it difficult to build effective predictive models, especially complex ones such as deep learning models. In this paper, we propose MetaPred, a meta-learning framework for clinical risk prediction from longitudinal patient EHR. In particular, in order to predict the target risk with limited data samples, we train a meta-learner on a set of related risk prediction tasks, which learns how a good predictor is trained. The meta-learner can then be directly used in target risk prediction, and the limited samples available in the target domain can be used to further fine-tune model performance. The effectiveness of MetaPred is tested on a real patient EHR repository from Oregon Health & Science University. We demonstrate that, with a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) as base predictors, MetaPred achieves much better performance in predicting the target risk with low resources than a predictor trained only on the limited samples available for this risk.
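
The workflow the abstract describes (meta-train on related risk tasks, then fine-tune on a small target task) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the optimization procedure, so the sketch uses a Reptile-style first-order update as a stand-in; a logistic-regression predictor replaces the CNN and RNN base predictors; the data are synthetic; and all function names (`make_task`, `sgd_steps`, `meta_train`, `demo`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_task(rng, true_w, n):
    """Synthetic binary 'risk' task (hypothetical data): label is 1 when true_w . x > 0."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        y = 1 if sum(w * xi for w, xi in zip(true_w, x)) > 0 else 0
        data.append((x, y))
    return data


def sgd_steps(weights, data, lr=0.5, steps=5):
    """Return a copy of weights after a few full-batch logistic-regression gradient steps."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def meta_train(source_tasks, dim, meta_lr=0.5, epochs=30):
    """Reptile-style loop (an assumed stand-in): move a shared initialization
    part of the way toward the weights adapted to each source task."""
    init = [0.0] * dim
    for _ in range(epochs):
        for data in source_tasks:
            adapted = sgd_steps(init, data)
            init = [w0 + meta_lr * (wa - w0) for w0, wa in zip(init, adapted)]
    return init


def accuracy(w, data):
    """Fraction of examples where the sign of w . x matches the label."""
    hits = sum((sum(wi * xi for wi, xi in zip(w, x)) > 0) == (y == 1) for x, y in data)
    return hits / len(data)


def demo(seed=0):
    rng = random.Random(seed)
    base = [2.0, -1.0, 0.5]  # structure shared by the related tasks (assumed)

    def perturbed():
        return [b + rng.uniform(-0.3, 0.3) for b in base]

    source_tasks = [make_task(rng, perturbed(), 200) for _ in range(3)]
    target_w = perturbed()
    target_train = make_task(rng, target_w, 10)   # low-resource target task
    target_test = make_task(rng, target_w, 200)

    init = meta_train(source_tasks, dim=len(base))
    scratch = sgd_steps([0.0] * len(base), target_train)  # baseline: target data only
    tuned = sgd_steps(init, target_train)                 # meta-learned init, then fine-tuning
    return accuracy(scratch, target_test), accuracy(tuned, target_test)
```

Calling `demo()` returns a pair: held-out accuracy of a predictor trained on the ten target samples alone, and held-out accuracy of the predictor that starts from the meta-learned initialization and is then fine-tuned on the same ten samples. These numbers come from toy data and say nothing about the results reported in the paper.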

Original language	English (US)
Title of host publication	KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher	Association for Computing Machinery
Pages	2487-2495
Number of pages	9
ISBN (Electronic)	9781450362016
DOIs	https://doi.org/10.1145/3292500.3330779
State	Published - Jul 25 2019
Event	25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019 - Anchorage, United States
Duration	Aug 4 2019 → Aug 8 2019

Publication series

Name	Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference	25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019
Country/Territory	United States
City	Anchorage
Period	8/4/19 → 8/8/19

Keywords

Clinical risk prediction
Electronic health records
Meta-learning

ASJC Scopus subject areas

Software
Information Systems

Access to Document

10.1145/3292500.3330779

Cite this

Zhang, X. S., Tang, F., Dodge, H. H., Zhou, J., & Wang, F. (2019). Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records. In KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 2487-2495). (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). Association for Computing Machinery. https://doi.org/10.1145/3292500.3330779

Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records. / Zhang, Xi Sheryl; Tang, Fengyi; Dodge, Hiroko H. et al.
KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2019. p. 2487-2495 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).

Zhang, XS, Tang, F, Dodge, HH, Zhou, J & Wang, F 2019, Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records. in KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, pp. 2487-2495, 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019, Anchorage, United States, 8/4/19. https://doi.org/10.1145/3292500.3330779

Zhang XS, Tang F, Dodge HH, Zhou J, Wang F. Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records. In KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery. 2019. p. 2487-2495. (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). doi: 10.1145/3292500.3330779

Zhang, Xi Sheryl ; Tang, Fengyi ; Dodge, Hiroko H. et al. / Metapred : Meta-learning for clinical risk prediction with limited patient electronic health records. KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2019. pp. 2487-2495 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).

@inproceedings{63b3f4f1d76045509c161412e0caeeb8,

title = "Metapred: Meta-learning for clinical risk prediction with limited patient electronic health records",

keywords = "Clinical risk prediction, Electronic health records, Meta-learning",

author = "Zhang, {Xi Sheryl} and Fengyi Tang and Dodge, {Hiroko H.} and Jiayu Zhou and Fei Wang",

note = "Publisher Copyright: {\textcopyright} 2019 Copyright is held by the owner/author(s). Publication rights licensed to ACM.; 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019 ; Conference date: 04-08-2019 Through 08-08-2019",

year = "2019",

month = jul,

day = "25",

doi = "10.1145/3292500.3330779",

language = "English (US)",

series = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

publisher = "Association for Computing Machinery",

pages = "2487--2495",

booktitle = "KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

}

TY - GEN

T1 - Metapred

T2 - 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019

AU - Zhang, Xi Sheryl

AU - Tang, Fengyi

AU - Dodge, Hiroko H.

AU - Zhou, Jiayu

AU - Wang, Fei

PY - 2019/7/25

Y1 - 2019/7/25

KW - Clinical risk prediction

KW - Electronic health records

KW - Meta-learning

UR - http://www.scopus.com/inward/record.url?scp=85071192943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071192943&partnerID=8YFLogxK

U2 - 10.1145/3292500.3330779

DO - 10.1145/3292500.3330779

M3 - Conference contribution

AN - SCOPUS:85071192943

T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

SP - 2487

EP - 2495

BT - KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

Y2 - 4 August 2019 through 8 August 2019

ER -
