http://dx.doi.org/10.5808/gi.20076

Machine learning based anti-cancer drug response prediction and search for predictor genes using cancer cell line gene expression  

Qiu, Kexin (Department of Computer Science, Dankook University)
Lee, JoongHo (Department of Computer Science, Dankook University)
Kim, HanByeol (Department of Computer Science, Dankook University)
Yoon, Seokhyun (Department of Computer Science, Dankook University)
Kang, Keunsoo (Department of Microbiology, Dankook University)
Abstract
Although many models have been proposed in recent years to accurately predict drug response in cell lines, understanding the genomic factors related to drug response is also key to realizing precision medicine in oncology. In this paper, using cancer cell line gene expression and drug response data, we established a reliable and accurate drug response prediction model and found predictor genes for several drugs of interest. To this end, we first pre-selected genes based on the Pearson correlation coefficient and then used an ElasticNet regression model for drug response prediction and fine gene selection. To find a more reliable set of predictor genes, we performed regression twice for each drug, once with IC50 and once with the area under the curve (AUC, or activity area). For the 12 drugs we tested, the predictive performance in terms of the Pearson correlation coefficient exceeded 0.6; the highest was for 17-AAG, with a Pearson correlation coefficient of 0.811 for IC50 and 0.81 for AUC. We identified predictor genes common to IC50 and AUC; with these, performance was similar to that obtained with genes found separately for IC50 and AUC, but with a much smaller number of predictor genes. Using only the common predictor genes, the highest performance was for AZD6244 (0.8016 for IC50, 0.7945 for AUC) with 321 predictor genes.
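The two-stage procedure described in the abstract (Pearson-based pre-selection, ElasticNet regression for fine selection, then intersecting the genes found for IC50 and AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the data are synthetic, the number of pre-selected genes `k`, the penalties `alpha` and `l1_ratio`, and the train/test split are arbitrary, and a small NumPy coordinate-descent solver stands in for whatever ElasticNet implementation the paper used.

```python
import numpy as np

def pearson_per_gene(X, y):
    """Pearson correlation between each gene (column of X) and the response y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc.T @ yc) / np.where(denom == 0, 1.0, denom)

def preselect(X, y, k):
    """Coarse step: indices of the k genes with the largest |r| against y."""
    return np.argsort(-np.abs(pearson_per_gene(X, y)))[:k]

def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_sweeps=200):
    """Coordinate descent for
    (1/2n)||y - Xw - b||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    alpha and l1_ratio are illustrative values, not taken from the paper."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, resid = X - x_mean, y - y_mean      # residual for w = 0
    col_sq = (Xc ** 2).sum(axis=0) / n
    w = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # Correlation of gene j with the residual that excludes gene j.
            rho = Xc[:, j] @ resid / n + col_sq[j] * w[j]
            # Soft-threshold (L1 part), then shrink (L2 part).
            w_new = (np.sign(rho) * max(abs(rho) - alpha * l1_ratio, 0.0)
                     / (col_sq[j] + alpha * (1 - l1_ratio)))
            resid += Xc[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w, y_mean - x_mean @ w

def fit_and_score(X_tr, y_tr, X_te, y_te, genes):
    """Fit on the given gene columns; return kept genes and test-set Pearson r."""
    genes = np.asarray(genes)
    w, b = fit_elastic_net(X_tr[:, genes], y_tr)
    pred = X_te[:, genes] @ w + b
    return set(genes[w != 0].tolist()), np.corrcoef(pred, y_te)[0, 1]

# Synthetic stand-in for expression data: 200 cell lines x 100 genes,
# with genes 0, 1, 2 driving both response measures (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
signal = 1.5 * (X[:, 0] - X[:, 1] + X[:, 2])
ic50 = signal + rng.normal(scale=0.5, size=200)
auc = -signal + rng.normal(scale=0.5, size=200)
tr, te = slice(0, 150), slice(150, 200)

# Regression is run twice, once per response measure; pre-selection uses
# training data only so the test score is not contaminated.
genes_ic50, r_ic50 = fit_and_score(X[tr], ic50[tr], X[te], ic50[te],
                                   preselect(X[tr], ic50[tr], k=20))
genes_auc, r_auc = fit_and_score(X[tr], auc[tr], X[te], auc[te],
                                 preselect(X[tr], auc[tr], k=20))

# Common predictor genes, then a refit restricted to them.
common = sorted(genes_ic50 & genes_auc)
_, r_common_ic50 = fit_and_score(X[tr], ic50[tr], X[te], ic50[te], common)
_, r_common_auc = fit_and_score(X[tr], auc[tr], X[te], auc[te], common)

print("separate:", round(r_ic50, 3), round(r_auc, 3))
print("common  :", round(r_common_ic50, 3), round(r_common_auc, 3), common)
```

On real expression data the genes would normally be standardized before fitting, and `k`, `alpha`, and `l1_ratio` would be chosen by cross-validation; the sketch skips both steps for brevity.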
Keywords
cell line gene expression data; drug response prediction; machine learning; predictor genes