Inferring Undiscovered Public Knowledge by Using Text Mining Analysis and Main Path Analysis: The Case of the Gene-Protein 'brings_about' Chains of Pancreatic Cancer

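Inferring Undiscovered Public Knowledge Using Text Mining and Main Path Analysis: The Case of Pancreatic Cancer Gene-Protein 'brings_about' Chains (translated from the Korean title)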

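  • 안혜림 (Department of Library and Information Science, Graduate School, Yonsei University) ;
  • 송민 (Department of Library and Information Science, Yonsei University) ;
  • 허고은 (Department of Library and Information Science, Graduate School, Yonsei University)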
  • Received : 2015.02.16
  • Accepted : 2015.03.19
  • Published : 2015.03.30

Abstract

This study infers the gene-protein 'brings_about' chains of pancreatic cancer reported in pancreatic cancer research by constructing a gene-protein interaction network for the disease. These chains can help uncover undiscovered public knowledge that may lead to empirical studies on the causes of pancreatic cancer. We applied a novel approach that integrates text mining and main path analysis into Swanson's ABC model, expanding the intermediate concepts to multiple levels and extracting the most significant path. We performed text mining on the full texts of pancreatic cancer research papers published over the preceding ten years and extracted gene-protein entities and relations. The 'brings_about' network was built from biological relations expressed by biological verbs (bio verbs), and main path analysis was then applied to it. The resulting main direct 'brings_about' path of pancreatic cancer comprises 14 nodes and 13 arcs. Nine arcs were confirmed as actual relations appearing in the related literature, while the other four arose during the network transformation required for main path analysis. We believe that combining text mining with main path analysis can be a useful tool for inferring undiscovered knowledge when either the starting point or the end point is unknown.

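In this study, we constructed a gene-protein interaction network for pancreatic cancer and identified the gene-protein 'brings_about' chains that are prominently mentioned in related research, with the goal of providing undiscovered public knowledge that can lead to empirical studies on the causes of pancreatic cancer. To this end, we applied text mining and main path analysis to Swanson's ABC model, extending the intermediate concept B into a directed multi-level model and deriving the most significant path. Main path analysis can be a useful tool for inferring undiscovered public knowledge in cases, such as the pancreatic cancer case studied here, where neither the starting point nor the end point can be fixed in advance.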
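The idea of extracting a most significant path from a directed 'brings_about' network can be illustrated with a minimal sketch. It assumes search path count (SPC) as the traversal weight and a local forward search, which is one common form of main path analysis; the abstract does not state which weighting or search strategy the authors used. The function name `spc_main_path`, the toy arcs, and the node labels are hypothetical and are not taken from the study's data. The input is assumed to be non-empty and acyclic, which is why a network usually has to be transformed before main path analysis.

```python
from collections import defaultdict


def spc_main_path(arcs):
    """Return (main_path, spc) for an acyclic directed network given as (source, target) arcs."""
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in dict.fromkeys(arcs):  # drop duplicate arcs, keep input order
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Kahn's algorithm: builds a topological order and detects cycles.
    indegree = {n: len(pred[n]) for n in nodes}
    queue = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.pop(0)
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("network must be acyclic for main path analysis")

    # Number of paths from any source to n, and from n to any sink.
    paths_in, paths_out = {}, {}
    for n in order:
        paths_in[n] = sum(paths_in[p] for p in pred[n]) if pred[n] else 1
    for n in reversed(order):
        paths_out[n] = sum(paths_out[s] for s in succ[n]) if succ[n] else 1

    # SPC of an arc = number of source-to-sink paths that traverse it.
    spc = {(u, v): paths_in[u] * paths_out[v] for u in order for v in succ[u]}

    # Local forward search: start from the heaviest arc leaving a source,
    # then keep following the heaviest outgoing arc until a sink is reached.
    # Ties are broken by input order, an arbitrary choice made for this sketch.
    sources = [n for n in order if not pred[n]]
    first = max(((u, v) for u in sources for v in succ[u]), key=spc.get)
    path = list(first)
    while succ[path[-1]]:
        tail = path[-1]
        path.append(max(succ[tail], key=lambda v: spc[(tail, v)]))
    return path, spc


# Hypothetical toy network; the labels are placeholders, not real genes or proteins.
toy_arcs = [
    ("S", "A"), ("S", "B"), ("A", "C"), ("A", "D"), ("B", "D"),
    ("C", "E"), ("D", "E"), ("D", "T"), ("E", "T"),
]
path, weights = spc_main_path(toy_arcs)
print(" -> ".join(path))
```

Because a local search always meets equally weighted arcs at its last branching point, tools that implement main path analysis often report all tied branches instead of picking one; this sketch keeps only the first for brevity.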

References

  1. Seoul National University Hospital Medical Information. Pancreatic Cancer [online]. [cited 2015.2.5].
  2. Heo, Go Eun and Song Min. 2014. "Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model." Journal of the Korean Society for Information Management, 31(1): 231-250. https://doi.org/10.3743/KOSIM.2014.31.1.231
  3. Blagosklonny, M. V. and A. B. Pardee. 2002. "Unearthing the gems." Nature, 416(6879): 373-373. https://doi.org/10.1038/416373a
  4. Cameron, D., O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, and T. C. Rindflesch. 2013. "A graph-based recovery and decomposition of Swanson's hypothesis using semantic predications." Journal of Biomedical Informatics, 46(2): 238-251. https://doi.org/10.1016/j.jbi.2012.09.004
  5. De Nooy, W., A. Mrvar, and V. Batagelj. 2005. Exploratory Social Network Analysis with Pajek. Revised and Expanded Second Edition. New York, USA: Cambridge University Press.
  6. DiGiacomo, R. A., J. M. Kremer, and D. M. Shah. 1989. "Fish oil dietary supplementation in patients with Raynaud's phenomenon: A double-blind, controlled, prospective study." American Journal of Medicine, 8: 158-164.
  7. Gustafsson, M., M. Hornquist, and A. Lombardi. 2005. "Constructing and analyzing a large-scale gene-to-gene regulatory network: Lasso-constrained inference and biological validation." IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2(3): 254-261. https://doi.org/10.1109/TCBB.2005.35
  8. Liu, J. S., and L. Y. Y. Lu. 2011. "An Integrated Approach for Main Path Analysis: Development of the Hirsch Index as an Example." Journal of the American Society for Information Science and Technology, 63(3): 528-542. https://doi.org/10.1002/asi.21692
  9. Mattiazzi, M., T. Curk, I. Krizaj, B. Zupan, and U. Petrovic. 2010. "Inference of the Molecular Mechanism of Action from Genetic Interaction and Gene Expression Data." OMICS: A Journal of Integrative Biology, 14(4): 357-367. https://doi.org/10.1089/omi.2009.0144
  10. Natarajan, J., D. Berrar, W. Dubitzky, C. Hack, Y. Zhang, C. Desesa, J. R. Van Brocklyn, and E. G. Bremer. 2006. "Text mining of full-text journal articles combined with gene expression analysis reveals a relationship between sphingosine-1-phosphate and invasiveness of a glioblastoma cell line." BMC Bioinformatics, 7: 373. https://doi.org/10.1186/1471-2105-7-373
  11. NIH. Current Relations in the Semantic Network [online]. [cited 2015.2.15].
  12. Popa, O., E. Hazkani-Covo, G. Landan, W. Martin, and T. Dagan. 2011. "Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes." Genome Research, 21(4): 599-609. https://doi.org/10.1101/gr.115592.110
  13. SecondString Project. Class SoftTFIDF [online]. [cited 2015.2.15].
  14. Selga, E., C. Oleaga, S. Ramirez, M. C. de Almagro, V. Noe, and C. J. Ciudad. 2009. "Networking of differentially expressed genes in human cancer cells resistant to methotrexate." Genome Medicine, 1: 83. https://doi.org/10.1186/gm83
  15. Swanson, D. R. 1986a. "Undiscovered public knowledge." The Library Quarterly, 56(2): 103-118. https://doi.org/10.1086/601720
  16. Swanson, D. R. 1986b. "Fish oil, Raynaud's syndrome, and undiscovered public knowledge." Perspectives in Biology and Medicine, 30(1): 7-18. https://doi.org/10.1353/pbm.1986.0087
  17. Swanson, D. R. 1988. "Migraine and magnesium: eleven neglected connections." Perspectives in Biology and Medicine, 31(4): 526-557. https://doi.org/10.1353/pbm.1988.0009
  18. Vinayagam, A., U. Stelzl, R. Foulle, S. Plassmann, M. Zenkner, J. Timm, H. E. Assmus, M. A. Andrade-Navarro, and E. E. Wanker. 2011. "A directed protein interaction network for investigating intracellular signal transduction." Science Signaling, 4(189): rs8. https://doi.org/10.1126/scisignal.2001446
  19. He, X.-Y. and Y.-Z. Yuan. 2014. "Advances in pancreatic cancer research: Moving towards early detection." World Journal of Gastroenterology, 20(32): 11241-11248. https://doi.org/10.3748/wjg.v20.i32.11241
  20. Yan, E., Y. Ding, and C. R. Sugimoto. 2011. "P-Rank: An Indicator Measuring Prestige in Heterogeneous Scholarly Networks." Journal of the American Society for Information Science and Technology, 62(3): 467-477. https://doi.org/10.1002/asi.21461