TY - JOUR
T1 - Credentialing individual samples for proteogenomic analysis
AU - Zhao, Wei
AU - Li, Jun
AU - Akbani, Rehan
AU - Liang, Han
AU - Mills, Gordon B.
N1 - Funding Information:
This study was supported in part by grants from the U.S. National Institutes of Health (CA175486 to H.L., CA209851 to H.L. and G.B.M., and CCSG grant CA016672) and by a grant from the Cancer Prevention and Research Institute of Texas (RP140462 to H.L.). This article contains supplemental material. To whom correspondence should be addressed: Department of Systems Biology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030. Tel.: (+1)713-563-4223; E-mail: wzhao3@mdanderson.org.
Publisher Copyright:
© 2018 Zhao et al.
PY - 2018/8
Y1 - 2018/8
AB - Integrated analyses of DNA, RNA and protein, so-called proteogenomic studies, have the potential to greatly increase our understanding of both normal physiology and disease development. However, such studies are hampered by the lack of a systematic approach to credentialing individual samples, which introduces noise that limits the ability to identify important biological signals. Indeed, a recent proteogenomic CPTAC study identified 26% of samples as unsatisfactory, resulting in a marked increase in cost and a loss of information content. Based on a large-scale analysis of RNA-seq data and of proteomic data generated by reverse phase protein arrays (RPPA) and by mass spectrometry, we propose a protein-mRNA correlation-based (PMC) score as a robust metric for credentialing single samples for integrated proteogenomic studies. Samples with high PMC scores have significantly higher protein-mRNA correlation, total protein content and tumor purity. Our results highlight the importance of credentialing individual samples prior to proteogenomic analysis.
UR - http://www.scopus.com/inward/record.url?scp=85051000523&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051000523&partnerID=8YFLogxK
U2 - 10.1074/mcp.RA118.000645
DO - 10.1074/mcp.RA118.000645
M3 - Article
C2 - 29716986
AN - SCOPUS:85051000523
SN - 1535-9476
VL - 17
SP - 1515
EP - 1530
JO - Molecular and Cellular Proteomics
JF - Molecular and Cellular Proteomics
IS - 8
ER -