Prediction of SNP interactions in complex diseases with mutual information and boolean algebra

  • Received: 2010.09.01
  • Accepted: 2010.10.11
  • Published: 2010.11.30

Abstract

Most chronic diseases are complex diseases caused by interactions among several genes. Identifying the single nucleotide polymorphisms (SNPs) and gene-gene interactions involved in the development of complex diseases can contribute to their prevention and treatment. Previous studies have mostly concentrated on finding only the set of SNPs involved. In this study we propose a method that uses Boolean expressions to show how these SNPs interact. The proposed method consists of two stages. In the first stage, we find the set of SNPs involved in disease development using entropy-based mutual information. In the second stage, we find the Boolean expression with the highest disease prediction accuracy among those composed of the SNP set obtained in the first stage. We applied the proposed method to clinical data to demonstrate its effectiveness, and compared it with previous results from SNP association studies.
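The first stage described above can be illustrated with a minimal sketch, not the authors' implementation. It assumes genotypes are coded 0, 1, 2 and that the SNP set is chosen by exhaustively scoring every k-SNP combination with I(X; Y) = H(X) + H(Y) - H(X, Y). The function names, the toy data, and the exhaustive search are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def best_snp_set(genotypes, status, k):
    """Return the k-SNP combination sharing the most information with status.

    genotypes: dict mapping a SNP name to a list of genotype codes (0, 1, 2),
               one per subject (coding is an assumption of this sketch).
    status:    list of 0/1 disease labels, one per subject.
    """
    def joint(snps):
        # Treat the tuple of genotypes at the chosen SNPs as one joint variable.
        return list(zip(*(genotypes[s] for s in snps)))

    return max(combinations(sorted(genotypes), k),
               key=lambda snps: mutual_information(joint(snps), status))


# Hypothetical toy data: a subject is a case exactly when
# snp_a and snp_b are both nonzero; snp_c is noise.
genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2, 0, 1],
    "snp_b": [0, 1, 0, 1, 0, 2, 2, 0],
    "snp_c": [1, 0, 0, 1, 1, 0, 0, 1],
}
status = [0, 0, 0, 1, 0, 1, 0, 0]
print(best_snp_set(genotypes, status, 2))
```

With small samples, mutual information estimated this way tends to favor larger SNP combinations, so in practice k would likely need to be fixed in advance or the score corrected.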

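Most chronic diseases are complex diseases that develop through interactions among multiple genes. Research that identifies the single nucleotide polymorphisms (SNPs) and gene-gene interactions involved in the onset of complex diseases contributes to disease prevention and treatment. Existing methods mostly stop at finding combinations of SNPs within specific genes. This study presents a method for finding a Boolean expression that represents the specific interactions among the members of an SNP combination. The method proposed in this paper consists of two stages. In the first stage, entropy-based mutual information is used to find the SNP combination involved in disease onset. In the second stage, among the Boolean expressions composed of the SNP combination found in the first stage, the one with the highest disease prediction accuracy is selected. The proposed method was applied to clinical data to test its effectiveness, and its strengths and weaknesses were compared with those of existing studies.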
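The second stage can be sketched in a similarly hedged way. The example below assumes each SNP has already been reduced to a true/false value per subject (for instance, carrier versus non-carrier). It searches only left-to-right chains of possibly negated SNPs joined by AND/OR, in a fixed SNP order. The binarization, the restricted expression family, and all function names are assumptions of this sketch; the abstract does not state how the authors enumerate or score expressions beyond prediction accuracy.

```python
from itertools import product


def evaluate(literals, ops, row):
    """Evaluate a left-to-right chain such as ((a and not b) or c) on one row.

    literals: tuple of (name, negated) pairs.
    ops:      tuple of "and"/"or" strings joining consecutive literals.
    row:      dict mapping a SNP name to True/False.
    """
    values = [row[name] != negated for name, negated in literals]
    result = values[0]
    for op, value in zip(ops, values[1:]):
        result = (result and value) if op == "and" else (result or value)
    return result


def best_expression(rows, status, snps):
    """Exhaustively search chains over snps; return (accuracy, literals, ops)."""
    best = None
    for negations in product([False, True], repeat=len(snps)):
        literals = tuple(zip(snps, negations))
        for ops in product(["and", "or"], repeat=len(snps) - 1):
            hits = sum(evaluate(literals, ops, r) == bool(s)
                       for r, s in zip(rows, status))
            accuracy = hits / len(rows)
            # Keep the first expression that reaches the highest accuracy.
            if best is None or accuracy > best[0]:
                best = (accuracy, literals, ops)
    return best


def to_string(literals, ops):
    """Render a chain as text, e.g. "(snp_a and not snp_b)"."""
    text = ("not " if literals[0][1] else "") + literals[0][0]
    for op, (name, negated) in zip(ops, literals[1:]):
        text = f"({text} {op} {'not ' if negated else ''}{name})"
    return text


# Hypothetical toy data: cases are subjects with snp_a set and snp_b unset.
rows = [{"snp_a": a, "snp_b": b}
        for a, b in product([False, True], repeat=2)] * 2
status = [int(r["snp_a"] and not r["snp_b"]) for r in rows]

accuracy, literals, ops = best_expression(rows, status, ("snp_a", "snp_b"))
print(to_string(literals, ops), accuracy)
```

Accuracy measured on the same subjects used for the search is optimistic; a held-out set or cross-validation would give a fairer estimate of predictive accuracy.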
