Search | Korea Science

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
- Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions so that the system can continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, databases with strict schemas, such as relational databases, cannot readily expand nodes when the stored data must be distributed across various nodes as the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
https://doi.org/10.7472/jksii.2013.14.6.71
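
The routing step described in the abstract above (the log collector classifies incoming logs by type and sends them either to the real-time store or to the per-unit-time aggregate store) can be illustrated with a minimal sketch. This is not the authors' implementation: the log type names, the `LogCollector` class, and the per-minute bucketing are assumptions made for illustration, and plain in-memory containers stand in for the MySQL and MongoDB modules.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical log types that need real-time analysis; the abstract
# does not name the actual categories used by the system.
REALTIME_TYPES = {"transaction_error", "login_failure"}


@dataclass
class LogCollector:
    # Stand-in for the MySQL module (logs requiring real-time analysis).
    realtime_store: list = field(default_factory=list)
    # Stand-in for the MongoDB module: schema-free records grouped
    # per unit time (assumed here to be one minute).
    aggregate_store: dict = field(default_factory=lambda: defaultdict(list))

    def collect(self, record: dict) -> str:
        """Classify one log record by type and route it to a store."""
        if record.get("type", "unknown") in REALTIME_TYPES:
            self.realtime_store.append(record)
            return "realtime"
        # ISO-8601 timestamps truncated to minutes: "2013-12-01T09:15"
        bucket = record["timestamp"][:16]
        self.aggregate_store[bucket].append(record)
        return "aggregate"


collector = LogCollector()
collector.collect({"type": "login_failure", "timestamp": "2013-12-01T09:15:42"})
collector.collect({"type": "page_view", "timestamp": "2013-12-01T09:15:59",
                   "detail": {"menu": "transfer"}})
```

Because the aggregate store accepts arbitrary dictionaries, records with different fields can sit side by side, which is the property the abstract attributes to a document-oriented store.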

The Object-Oriented Class Hierarchy Structure Design Method using the Rapid Prototyping Techniques (래피드 프로토토입핑 기법을 사용한 객체 지향 클래스 계층 구조 설계 방법)

Heo, Kwae-Bum;Choi, Young-Eun
- The Transactions of the Korea Information Processing Society / v.5 no.1 / pp.86-96 / 1998
The class hierarchy structure in an object-oriented design model is effective for software reusability and the design of complex systems. This paper suggests an object-oriented class hierarchy structure design method using rapid prototyping techniques. In this method, relationship recognition and similarity are estimated by the new class classification at the object modeling level. Then the estimation of attributes and methods in each class is needed. Each design module, such as the class hierarchy structure, which is generated through interactive and repeated work, consists of reference relationships, inheritance relationships, and composite relationships. This information is stored in a table to support program maintenance and implementation; the class relationships are represented as a graph, and the node classes are iconized. This method is effective in restructuring the class hierarchy and reusing design information, because new classes can be added and deleted with ease. The efficiency of system analysis, design, and implementation is enhanced by conversion into a prototype system and a real system.

System Optimization Technique using Crosscutting Concern (크로스커팅 개념을 이용한 시스템 최적화 기법)

Lee, Seunghyung;Yoo, Hyun
- Journal of Digital Convergence / v.15 no.3 / pp.181-186 / 2017
System optimization is a technique that changes the structure of a program in order to extract duplicated modules and reuse them, without changing the source code. Structure-oriented development and object-oriented development are efficient at modularizing concerns, but cannot modularize crosscutting concerns. To apply the crosscutting concept to an existing system, a technique is needed for extracting the optimization modules distributed within the system. This paper proposes a method for extracting the redundant modules in a completed system. The proposed method extracts overlapping elements through a source code analysis of data dependency and control dependency. The extracted redundant elements are used in program dependency analysis for system optimization. The result of the duplicated dependency analysis is converted into a control flow graph, from which a minimal crosscutting module can be produced. Using the elements extracted by dependency analysis, the paper proposes a system optimization method that minimizes duplicated code within the system by setting up crosscutting concern modules.
https://doi.org/10.14400/JDC.2017.15.3.181

Regression Testing of Software Evolution by AOP (AOP를 이용하여 진화된 프로그램의 회귀테스트 기법)

Lee, Mi-Jin;Choi, Eun-Man
- The KIPS Transactions:PartD / v.15D no.4 / pp.495-504 / 2008
Aspect-Oriented Programming (AOP) is a relatively new programming paradigm and has properties that other programming paradigms do not have. This paradigm provides a new modularization of software systems by cross-cutting concerns. In this paper, we propose a regression test method for programs evolved by AOP. By using JoinPoint, we can catch a pointcut name, which makes it possible to test for the incorrect pointcut strength fault and the incorrect aspect precedence fault. By extending proof rules to aspects, we can recognize the failure to establish expected postconditions fault. We can also trace variables using the set() and get() pointcuts and test for the failure to preserve state invariants fault. Using a control flow graph, we can test for the incorrect changes in control dependencies fault. To show the correctness of the proposed method, a channel management system is implemented and tested using it.
https://doi.org/10.3745/KIPSTD.2008.15-D.4.495

The Ecology of the Scientific Literature and Information Retrieval (I)

Jeong, Jun-Min
- Journal of the Korean Society for Information Management / v.2 no.2 / pp.3-37 / 1985
This research deals with the problems encountered in designing systems for more efficient and effective information retrieval amid the proliferation of literature. This research was designed to develop and test 1) the partitioning of a large bibliographic database into quality-oriented subsets (quality filtering), and 2) a system for effective and efficient information retrieval within subsets of the database (relevance). In order to accomplish this partitioning, the 'kernel' technique of graph theory was applied. In addition, a method of quality filtering utilizing the 'epidemic' theory and the 'obsolescence' of scientific literature was developed.

The Ecology of the Scientific Literature and Information Retrieval (II)

Jeong, Jun-Min
- Journal of the Korean Society for Information Management / v.3 no.1 / pp.3-16 / 1986
This research deals with the problems encountered in designing systems for more efficient and effective information retrieval amid the proliferation of literature. This research was designed to develop and test 1) the partitioning of a large bibliographic database into quality-oriented subsets (quality filtering), and 2) a system for effective and efficient information retrieval within subsets of the database (relevance). In order to accomplish this partitioning, the 'kernel' technique of graph theory was applied. In addition, a method of quality filtering utilizing the 'epidemic' theory and the 'obsolescence' of scientific literature was developed.

Energy-aware Dalvik Bytecode List Scheduling Technique for Mobile Applications (모바일 어플리케이션을 위한 에너지-인식 달빅 바이트코드 리스트 스케줄링 기술)

Ko, Kwang Man
- KIPS Transactions on Computer and Communication Systems / v.3 no.5 / pp.151-154 / 2014
The energy of applications is consumed through complex interactions among operating systems, run-time environments, compilers, and applications on various mobile devices. Recently, challenging research has sought to reduce energy consumption on mobile devices using energy-oriented high-level and low-level compiler techniques. In this paper, we aim to reduce the energy consumption of Java mobile applications by applying list instruction scheduling for energy dissipation to Dalvik bytecode extracted from Android dex files. Through this work, we can construct an optimized power and energy environment on mobile devices with a limited power supply.
https://doi.org/10.3745/KTCCS.2014.3.5.151
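
To make the idea of energy-aware list scheduling in the abstract above concrete, here is a generic greedy list scheduler over a dependency graph. It is a sketch under stated assumptions, not the paper's algorithm: the abstract does not give its cost model, so the code assumes a hypothetical model in which switching between different instruction classes costs one unit and staying in the same class costs nothing. The function name `list_schedule` and the instruction labels are likewise illustrative.

```python
def list_schedule(instrs, deps, switch_cost=None):
    """Greedy list scheduling with a hypothetical energy model.

    instrs: dict mapping instruction name -> instruction class (e.g. "load")
    deps:   dict mapping instruction name -> set of names it depends on
    switch_cost: optional dict {(prev_class, next_class): cost}; pairs not
                 listed cost 0 for the same class and 1 otherwise (assumed).
    """
    switch_cost = switch_cost or {}
    # Unscheduled predecessors still blocking each instruction.
    remaining = {name: set(deps.get(name, ())) for name in instrs}
    order, prev = [], None

    def cost(name):
        if prev is None:
            return 0
        pair = (instrs[prev], instrs[name])
        return switch_cost.get(pair, 0 if pair[0] == pair[1] else 1)

    while remaining:
        # The ready list: instructions whose dependencies are all scheduled.
        ready = sorted(n for n, blockers in remaining.items() if not blockers)
        if not ready:
            raise ValueError("cyclic dependencies")
        pick = min(ready, key=cost)  # cheapest transition; ties go by name
        order.append(pick)
        del remaining[pick]
        for blockers in remaining.values():
            blockers.discard(pick)
        prev = pick
    return order


instrs = {"i1": "load", "i2": "alu", "i3": "load", "i4": "alu"}
deps = {"i4": {"i1"}}
schedule = list_schedule(instrs, deps)
```

With the assumed cost model, the scheduler groups the two loads and the two ALU operations together while still placing `i1` before the dependent `i4`.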

A Statistics Education Package Tong-Gramy for 5-8 Graders (초중등학생 교육용 통계패키지 통그라미 개발)

Lee, Jung Jin;Lee, Tae Rim;Kang, Gunseog;Kim, Sungsoo;Park, Heon Jin;Lee, Yoon-Dong;Sim, Songyong
- The Korean Journal of Applied Statistics / v.27 no.3 / pp.487-500 / 2014
The elementary school curriculum includes some statistical concepts and many graphical methods. However, statistical concepts are difficult to understand; consequently, many of those graphs and numerical summaries are obtained by hand. We develop an intuitive statistics education package called Tong-Gramy focused on 5-8 graders to help students and teachers study statistics. This software covers numerical and graphical statistics that appear in 5-8 graders' textbooks. The graphs provided are dynamically linked to data and every graph is linked to every datum. The graphs of Tong-Gramy are dynamic graphs and morphing technology is used where applicable.
https://doi.org/10.5351/KJAS.2014.27.3.487

Improvement and Evaluation of the Korean Large Vocabulary Continuous Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼(ECHOS)의 개선 및 평가)

Kwon, Suk-Bong;Yun, Sung-Rack;Jang, Gyu-Cheol;Kim, Yong-Rae;Kim, Bong-Wan;Kim, Hoi-Rin;Yoo, Chang-Dong;Lee, Yong-Ju;Kwon, Oh-Wook
- MALSORI / no.59 / pp.53-68 / 2006
We report the evaluation results of the Korean speech recognition platform called ECHOS. The platform has an object-oriented and reusable architecture so that researchers can easily evaluate their own algorithms. The platform has all the intrinsic modules needed to build a large vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs word-dependent n-best search with a bigram in the forward search stage, and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree increases word errors by 40% but reduces recognition time by 50% compared to the HTK platform with a flat lexicon. ECHOS reduces recognition errors by 40% through the incorporation of cross-word modeling. With the number of Gaussian mixtures increased to 16, it yields word accuracy comparable to the previous lexical tree-based platform, Julius.

Analysis of Research Trends in Journal of Korean Society for Quality Management by Text Mining Processing (텍스트 마이닝 처리로 품질경영학회지 연구동향 분석)

Ree, Sangbok
- Journal of Korean Society for Quality Management / v.47 no.3 / pp.597-613 / 2019
Purpose: The purpose of this study is to analyze the trend of quality research by analyzing the entire JKSQM (Journal of the Korean Society for Quality Management). Methods: This study analyzes the frequency of words used in the abstracts of all JKSQM papers by applying text mining processing. Among text mining techniques, we use word clouds. Results: 22 high-frequency words were identified in the abstracts of the papers published in the JKSQM over 42 years. The frequency of words is shown on a 10-year basis, and the four most important words are plotted on a change graph for each volume. The frequent words of each volume are listed in the appendix. Conclusion: The main research results are as follows. First, there has been no significant change in research trends over the last 40 years. Second, the early SQC words have been widely used, and since 1990 many words such as service-oriented words have been used, indicating a change in the times. Third, words related to the 4th Industrial Revolution have been used only weakly since 2010. From the above analysis, the trend of quality research in Korea stays within the quality category and can be considered conservative. Everything is expected to change in the period of the 4th Industrial Revolution, and it is time to study the direction of quality in Korea.
https://doi.org/10.7469/JKSQM.2019.47.3.597
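
The word-frequency step in the abstract above (counting words in abstracts and reporting them on a 10-year basis) can be sketched as follows. This is an illustrative sketch only: the stopword list, the tokenization rule, the function name `word_frequencies`, and the sample abstracts are assumptions, not the study's data or procedure.

```python
import re
from collections import Counter

# Small illustrative stopword list; the study's actual list is not given.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "for"}


def word_frequencies(abstracts_by_year, span=10):
    """Count word frequencies per period of `span` years.

    abstracts_by_year: iterable of (year, abstract_text) pairs.
    Returns a dict mapping the first year of each period to a Counter.
    """
    by_period = {}
    for year, text in abstracts_by_year:
        period_start = year - year % span  # e.g. 1985 -> 1980
        words = [w for w in re.findall(r"[a-z]+", text.lower())
                 if w not in STOPWORDS]
        by_period.setdefault(period_start, Counter()).update(words)
    return by_period


# Made-up sample input, for demonstration only.
sample = [
    (1985, "Statistical quality control of the process"),
    (1988, "Quality control charts"),
    (1995, "Service quality and customer satisfaction"),
]
freqs = word_frequencies(sample)
```

Calling `freqs[1980].most_common(5)` would then list the most frequent words of that decade, which is the kind of table a word cloud or a change graph is drawn from.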

