TY - JOUR
T1 - Predicting cancer cell line dependencies from the protein expression data of reverse-phase protein arrays
AU - Chen, Mei-Ju May
AU - Li, Jun
AU - Mills, Gordon B.
AU - Liang, Han
N1 - Funding Information:
Supported by Grants No. CA217842 and CA210950 from the National Cancer Institute (G.B.M.), CA175486 (H.L.), CA209851 (H.L. and G.B.M.), and Cancer Center Support Grant No. CA016672; by an MD Anderson Faculty Scholar Award; the Lorraine Dell Program in Bioinformatics for Personalization of Cancer Medicine (H.L.); and the Adelson Medical Research Foundation (G.B.M.).
Publisher Copyright:
© 2020 by American Society of Clinical Oncology.
PY - 2020
Y1 - 2020
N2 - PURPOSE Predicting cancer dependencies from molecular data can help stratify patients and identify novel therapeutic targets. Recently available data on large-scale cancer cell line dependency allow a systematic assessment of the predictive power of diverse molecular features; however, the protein expression data have not been rigorously evaluated. By using the protein expression data generated by reverse-phase protein arrays, we aimed to assess their predictive power in identifying cancer dependencies and to develop a related analytic tool for community use. MATERIALS AND METHODS By using a machine learning schema, we conducted an analysis of feature importance based on cancer dependency and multiomic data from the DepMap and Cancer Cell Line Encyclopedia projects. We assessed the consistency of cancer dependency data between CRISPR/Cas9 and short hairpin RNA–mediated perturbation platforms. For a fair comparison, we focused on a set of genes with robust dependency data and four available expression-related features (copy number alteration, DNA methylation, messenger RNA expression, and protein expression) and performed the same-gene predictions of the cancer dependency using different molecular features. RESULTS For the genes surveyed, we observed that the protein expression data contained substantial predictive power for cancer dependencies, and they were the best predictive feature for the CRISPR/Cas9-based dependency data. We also developed a user-friendly protein-dependency analytic module and integrated it with The Cancer Proteome Atlas; this module allows researchers to explore and analyze our results intuitively. CONCLUSION This study provides a systematic assessment for predicting cancer dependencies of cell lines from different expression-related features of a gene. Our results suggest that protein expression data are a highly valuable information resource for understanding tumor vulnerabilities and identifying therapeutic opportunities.
UR - http://www.scopus.com/inward/record.url?scp=85084031024&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084031024&partnerID=8YFLogxK
U2 - 10.1200/CCI.19.00144
DO - 10.1200/CCI.19.00144
M3 - Article
C2 - 32330068
AN - SCOPUS:85084031024
SN - 2473-4276
VL - 4
SP - 357
EP - 366
JO - JCO Clinical Cancer Informatics
JF - JCO Clinical Cancer Informatics
ER -