Microarray Probe Design with Multiobjective Evolutionary Algorithm

Microarray Probe Design Using a Multiobjective Evolutionary Algorithm

  • Published : 2008.08.15

Abstract

Probe design is an essential task for successful DNA microarray experiments. The requirements for probes vary with the purpose and type of the microarray experiment. Most previous work uses a simple filtering approach with a fixed threshold value for each requirement. Here, we formulate probe design as a multiobjective optimization problem with two objectives and solve it using an ${\epsilon}$-multiobjective evolutionary algorithm. The suggested approach was applied to designing probes for 19 types of Human Papillomavirus and for 52 genes in the Arabidopsis Calmodulin multigene family, and it produced more target-specific probes than the well-known probe design tools OligoArray and OligoWiz.

Probe design is an essential task for successful DNA microarray experiments. The conditions that probes must satisfy can be defined in various ways depending on the purpose or method of the microarray experiment, and most previous studies search for probes that do not exceed a threshold value set independently for each condition. In this study, however, we define probe design as a multiobjective optimization problem with two objective functions and present a method for solving it with an ${\epsilon}$-multiobjective evolutionary algorithm. The presented method was applied to probe design for the genes of 19 high-risk Human Papillomavirus types and to probe design for the 52 genes of the Arabidopsis Calmodulin multigene family. With the proposed methodology, we were able to find probes better suited to the target genes than those found by the existing publicly available probe design programs OligoArray and OligoWiz.
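To make the idea of an ${\epsilon}$-multiobjective evolutionary algorithm more concrete, the following is a minimal sketch of the ${\epsilon}$-dominance archive update commonly used in such algorithms. It is not the authors' implementation. The abstract does not state the two objectives, so the objective vectors here are hypothetical placeholders that are both assumed to be minimized, and the names `box`, `dominates`, and `update_archive` are illustrative only.

```python
import math


def box(objs, eps):
    """Map an objective vector (all minimized) to its epsilon-box index."""
    return tuple(math.floor(f / e) for f, e in zip(objs, eps))


def dominates(a, b):
    """Pareto dominance for minimization: no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def update_archive(archive, cand, eps):
    """Offer candidate `cand` (tuple of objective values) to the archive.

    Returns the original archive if the candidate is rejected,
    otherwise a new list that contains the candidate.
    """
    cb = box(cand, eps)
    kept = []
    for member in archive:
        mb = box(member, eps)
        if dominates(mb, cb):
            return archive  # candidate's box is dominated: reject
        if mb == cb:
            # Same box: keep a single representative.
            if dominates(member, cand):
                return archive
            if not dominates(cand, member):
                corner = tuple(b * e for b, e in zip(cb, eps))
                if sq_dist(member, corner) <= sq_dist(cand, corner):
                    return archive
            continue  # candidate replaces this member
        if dominates(cb, mb):
            continue  # member's box is dominated by the candidate's box
        kept.append(member)
    kept.append(cand)
    return kept


if __name__ == "__main__":
    eps = (0.25, 0.25)  # hypothetical box size per objective
    archive = []
    # Hypothetical (objective 1, objective 2) values for candidate probe sets.
    for point in [(2.0, 0.25), (1.0, 1.0), (1.1, 1.2), (1.75, 1.75), (0.25, 2.0)]:
        archive = update_archive(archive, point, eps)
    print(archive)
```

Under this scheme the archive holds at most one solution per ${\epsilon}$-box, so the box size controls how finely the trade-off front between the two objectives is resolved. This differs from fixed-threshold filtering, which accepts or rejects each probe on each requirement independently.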
