An Efficient Method to Find Accurate Spot-matching Patterns in Protein 2-DE Image Analysis

  • 김연화 (Department of Computer Science, Yonsei University)
  • 이원석 (Department of Computer Science, Yonsei University)
  • Received : 2009.12.29
  • Accepted : 2010.02.12
  • Published : 2010.05.15

Abstract

In protein 2-DE image analysis, the spot-matching operation identifies the spot of the same protein in each 2-DE gel image, and its accuracy is strongly affected by errors arising from varying experimental conditions. This paper proposes an efficient method that finds more accurate spot-matching patterns by using multiple reference gel images. In addition, to reduce the execution time, which grows exponentially with the number of gel images, a "partition then extension" framework is used to find long spot-matching patterns of higher accuracy. Experiments on real 2-DE images of human liver tissue confirm the accuracy and efficiency of the proposed algorithm.

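In protein 2-DE image analysis, the accuracy of image spot-matching analysis is lowered by the inherent instability of proteins and by fundamental problems of the 2-DE experiment itself. This paper proposes a method that uses multiple reference images to compensate for image distortion, which strongly affects the accuracy of spot-matching patterns, thereby removing the resulting noise spots and minimizing the accuracy degradation caused by reference-image quality. In addition, because of the data characteristics of 2-DE images, performance drops sharply as the number of images increases. To address this, a "partition and extension" approach, guided by the biological characteristics of the images, is applied to the spot-matching database built from the multiple reference images. This efficiently generates spot-matching patterns that improve accuracy while guaranteeing pattern length. Experiments on real human 2-DE image data demonstrate the validity of the proposed method.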
