Search | Korea Science

Son, Siwoon;Gil, Myeong-Seon;Moon, Yang-Sae
- KIISE Transactions on Computing Practices
- /
- v.23 no.2
- /
- pp.128-133
- /
- 2017
In recent years, the number of systems for the analysis of large volumes of data is increasing. Hadoop, a representative big data system, stores and processes the large data in the distributed environment of multiple servers, where system-resource management is very important. The authors attempted to detect anomalies from the rapid changing of the log data that are collected from the multiple servers using simple but efficient anomaly-detection techniques. Accordingly, an Apache Hive storage architecture was designed to store the log data that were collected from the multiple servers in the Hadoop ecosystem. Also, three anomaly-detection techniques were designed based on the moving-average and 3-sigma concepts. It was finally confirmed that all three of the techniques detected the abnormal intervals correctly, while the weighted anomaly-detection technique is more precise than the basic techniques. These results show an excellent approach for the detection of log-data anomalies with the use of simple techniques in the Hadoop ecosystem.
https://doi.org/10.5626/KTCP.2017.23.2.128 인용 KSCI

Son, Siwoon;Gil, Myeong-Seon;Moon, Yang-Sae;Won, Hee-Sun
- KIPS Transactions on Software and Data Engineering
- /
- v.5 no.6
- /
- pp.283-288
- /
- 2016
In recent years, there have been many research efforts on Big Data, and many companies developed a variety of relevant products. Accordingly, we are able to store and analyze a large volume of log data, which have been difficult to be handled in the traditional computing environment. To handle a large volume of log data, which rapidly occur in multiple servers, in this paper we design a new data storage architecture to efficiently analyze those big log data through Apache Hive. We then design and implement anomaly detection methods, which identify abnormal status of servers from log data, based on moving average and 3-sigma techniques. We also show effectiveness of the proposed detection methods by demonstrating that our methods identifies anomalies correctly. These results show that our anomaly detection is an excellent approach for properly detecting anomalies from Hadoop log data.
https://doi.org/10.3745/KTSDE.2016.5.6.283 인용 PDF KSCI