A machine learning model for the derivation of major molecular descriptor using candidate drug information of diabetes treatment

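Korean title: A machine learning model using candidate drug information for diabetes treatment and the derivation of major molecular descriptors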

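  • 남궁윤 (Department of Convergence Technology Management Engineering, Yonsei University) ;
  • 김창욱 (Department of Industrial Engineering, Yonsei University) ;
  • 이창준 (Dotmatics Co., Ltd.)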
  • Received : 2019.01.28
  • Accepted : 2019.03.20
  • Published : 2019.03.28

Abstract

The purpose of this study is to identify the structural features of substances that affect antidiabetic activity, using candidate drug information for diabetes treatment. A quantitative structure-activity relationship (QSAR) model based on machine learning was constructed, and major molecular descriptors were determined for each set of experimental data variables from the coefficient values obtained with a partial least squares (PLS) algorithm. The analysis of molecular access system (MACCS) fingerprint data, which reflect the structure of the candidate drugs, achieved a better goodness of fit than the analysis of in vitro data, and a variety of major molecular descriptors affecting the antidiabetic effect were derived. Applied to a new drug development environment, the proposed method could reduce the cost of candidate screening experiments and shorten the search time for new drugs.

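This study aims to discover the substance structures that affect antidiabetic activity, using candidate drug information for diabetes treatment. A machine learning model based on the quantitative structure-activity relationship was built; the coefficient of determination was obtained for each experimental data set with a partial least squares algorithm, and the major molecular descriptors were then derived using a variable importance measure. The analysis based on molecular access system fingerprint data, which reflect the structural information of the candidate drugs, showed higher explanatory power than the analysis based on in vitro data, and a variety of major molecular descriptors affecting antidiabetic activity could be derived. Unlike conventional new drug development, which repeats similar experiments, the proposed method for antidiabetic prediction and key factor analysis is expected to minimize the costly and time-consuming candidate screening period and to shorten the exploration period of new drug development.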
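
The workflow summarized in the abstract (fit a partial least squares model, read off the coefficient of determination, then rank descriptors by a variable importance measure) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single response variable, the NIPALS form of PLS1, and the commonly used VIP score as the importance measure. The function name pls1_vip, the 6 x 4 fingerprint matrix, and the activity values are hypothetical and serve only as an illustration.

```python
import math


def pls1_vip(X, y, n_components=2):
    """Fit PLS1 by NIPALS and return (R^2 on the training data, VIP scores)."""
    n, p = len(X), len(X[0])
    # Mean-center every descriptor column and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]
    ss_total = sum(v * v for v in f)

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction in descriptor space most covariant with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to extract
        w = [v / norm for v in w]
        # Scores t, descriptor loadings p_load, and response loading q.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # response variance explained by component

    if not weights:
        raise ValueError("descriptors carry no covariance with the response")
    r2 = 1.0 - sum(v * v for v in f) / ss_total
    total = sum(explained)
    # VIP_j = sqrt(p * sum_a(SS_a * w_ja^2) / sum_a(SS_a)), with unit-length w_a.
    vip = [
        math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
        for j in range(p)
    ]
    return r2, vip


# Hypothetical data: 6 compounds x 4 fingerprint bits, with made-up activities.
X = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
y = [7.2, 6.8, 5.1, 4.0, 7.5, 3.9]

r2, vip = pls1_vip(X, y, n_components=2)
print(f"R^2 = {r2:.3f}")
for j in sorted(range(len(vip)), key=lambda k: -vip[k]):
    print(f"bit {j}: VIP = {vip[j]:.3f}")
```

Descriptors with a VIP score above 1 are conventionally treated as influential, since the squared VIP scores average to 1 by construction. The R^2 printed here is the in-sample fit only; a real study would also need cross-validation to choose the number of components.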

Fig. 1. New drug exploration phase

Table 1. Description of variables

Table 2. R² results for the three experiments

Table 3. Major in vitro variables

Table 4. Major MACCS variables
