http://dx.doi.org/10.5626/JCSE.2016.10.1.21

Data-Compression-Based Resource Management in Cloud Computing for Biology and Medicine  

Zhu, Changming (College of Information Engineering, Shanghai Maritime University)
Publication Information
Journal of Computing Science and Engineering / v.10, no.1, 2016, pp. 21-31
Abstract
With the development and application of biomedical techniques such as next-generation sequencing, mass spectrometry, and medical imaging, the amount of biomedical data has been growing explosively. Processing such data raises the familiar problems of big data, computationally intensive workloads, and high dimensionality. Cloud computing offers significant advantages in resource allocation, data storage, computation, and sharing, and so provides a way to address the big data problems of biomedical research. To improve the efficiency of resource management in cloud computing, this paper proposes a clustering method and adopts the Radial Basis Function (RBF) to compress comprehensive biological and medical data sets with high quality, and stores the compressed data under cloud resource management. Experiments validate that, with this data-compression-based resource management, large data sets from biology and medicine can be stored in less capacity. Furthermore, by applying the reverse operation of the Radial Basis Function, the compressed data can be reconstructed with high accuracy.
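The abstract does not spell out the algorithm, so the following is only a minimal sketch of how clustering plus Radial Basis Functions could compress and reconstruct a sampled signal. Everything in it is an assumption made for illustration, not the paper's method: the 1-D k-means used as the clustering step, the Gaussian form of the RBF, the least-squares fit of the weights, and the names `kmeans_1d`, `compress`, and `reconstruct`. The synthetic two-peak signal is likewise made up (loosely shaped like a spectrum) and is not data from the paper.

```python
import math


def kmeans_1d(xs, k, iters=25):
    """Lloyd's k-means on scalars with a deterministic, evenly spread start."""
    ordered = sorted(xs)
    n = len(ordered)
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the previous center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


def rbf(x, center, width):
    """Gaussian radial basis function (an assumed choice of RBF)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def solve(a, b):
    """Solve the square system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def compress(xs, ys, k, width, ridge=1e-8):
    """Replace len(ys) samples by k centers, k weights, and one width."""
    centers = kmeans_1d(xs, k)
    phi = [[rbf(x, c, width) for c in centers] for x in xs]
    # Regularized normal equations: (Phi^T Phi + ridge * I) w = Phi^T y.
    gram = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return {"centers": centers, "width": width, "weights": solve(gram, rhs)}


def reconstruct(model, xs):
    """Evaluate the stored RBF expansion to recover approximate samples."""
    return [sum(w * rbf(x, c, model["width"])
                for w, c in zip(model["weights"], model["centers"]))
            for x in xs]


# Hypothetical signal: two smooth peaks sampled at 201 points.
xs = [i * 0.05 for i in range(201)]
ys = [math.exp(-((x - 3.0) ** 2) / 2.0) + 0.5 * math.exp(-((x - 7.0) ** 2) / 2.0)
      for x in xs]
model = compress(xs, ys, k=16, width=0.6)
approx = reconstruct(model, xs)
max_err = max(abs(a - b) for a, b in zip(approx, ys))
stored = 2 * len(model["centers"]) + 1
print(f"stored {stored} numbers for {len(ys)} samples, max error {max_err:.2e}")
```

In this sketch the compressed form is lossy: accuracy depends on how well `k` and `width` match the smoothness of the data, which is why both are left as explicit parameters.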
Keywords
Biomedical data; Cloud computing; Data compression; Data reconstruction