http://dx.doi.org/10.3837/tiis.2015.08.026

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment  

Kim, Myoungjin (Department of Internet and Multimedia Engineering, Konkuk University)
Cui, Yun (Department of Internet and Multimedia Engineering, Konkuk University)
Lee, Hanku (Department of Internet and Multimedia Engineering, Konkuk University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.9, no.8, 2015, pp. 3182-3202
Abstract
Flexibly expanding the storage capacity needed to process a large and rapidly growing volume of unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system that categorizes and analyzes unstructured log data is extremely difficult. To overcome these limitations, we propose and design a MongoDB-based unstructured log processing system (MdbULPS) for collecting, categorizing, and analyzing log data generated by banks. The proposed system includes a Hadoop-based analysis module for reliable parallel-distributed processing of massive log data. Furthermore, because the Hadoop distributed file system (HDFS) stores collected log data in block units and replicates those blocks, the proposed system offers automatic recovery from system failures and data loss. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods for effectively processing unstructured log data. To evaluate the proposed system, we conducted three performance tests on a local test bed of twelve nodes: a comparison with a MySQL-based approach, a comparison with an HBase-based approach, and a test that varied the chunk size option. In these experiments, our system showed better performance in processing unstructured log data.
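As a rough illustration of the categorize-then-analyze flow the abstract describes, the following sketch parses free-form log lines into schemaless documents (the kind of record a document store holds) and counts them per category in a map/reduce style. The log format, the sample lines, the field names, and the functions `parse_line`, `map_phase`, and `reduce_phase` are all hypothetical and are not taken from the paper; no MongoDB or Hadoop API is used.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) log format:
# "<date> <time> <LEVEL> <free text with optional key=value pairs>"
LINE_RE = re.compile(r"^(\S+) (\S+) (\w+) (.*)$")
KV_RE = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Turn one raw log line into a schemaless document (a dict)."""
    text = line.strip()
    match = LINE_RE.match(text)
    if match is None:
        # Keep unparseable lines instead of dropping them.
        return {"category": "UNPARSED", "raw": text}
    date, time, level, message = match.groups()
    doc = {"date": date, "time": time, "category": level, "message": message}
    # Documents may carry different fields from one line to the next;
    # setdefault keeps the core fields from being overwritten.
    for key, value in KV_RE.findall(message):
        doc.setdefault(key, value)
    return doc


def map_phase(docs):
    """Emit (category, 1) pairs, as a mapper would."""
    for doc in docs:
        yield doc["category"], 1


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Made-up sample lines, for illustration only.
sample_logs = [
    "2015-08-01 09:00:01 INFO login ok user=alice branch=seoul",
    "2015-08-01 09:00:05 ERROR transfer failed user=bob code=E42",
    "2015-08-01 09:00:09 INFO balance check user=alice",
    "malformed entry",
]

documents = [parse_line(line) for line in sample_logs]
category_counts = reduce_phase(map_phase(documents))
print(category_counts)
```

In a deployment along the lines of the abstract, the documents would be written to a sharded document database and the counting would run as a distributed job; here both steps run in a single process to keep the sketch self-contained.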
Keywords
NoSQL; MongoDB; Cloud Computing; Hadoop; Big data processing; Banking System