• Title/Summary/Keyword: Log Analysis System


Utilization of Log Data Reflecting User Information-Seeking Behavior in the Digital Library

  • Lee, Seonhee;Lee, Jee Yeon
    • Journal of Information Science Theory and Practice / v.10 no.1 / pp.73-88 / 2022
  • This exploratory study aims to understand the potential of log data analysis and to expand its use in user research methods. Transaction log data are records of the electronic interactions between users and web services; in digital libraries, where users interact with the service system while searching for information, they reflect information-seeking behavior. Three days of log data from South Korea's National Digital Science Library (NDSL), comprising 150,000 records, were analyzed in two ways: a log pattern analysis and a statistical log context analysis. First, the pattern-based analysis examined the general usage paths of logged-in and non-logged-in users, and the correlation between paths was analyzed with a χ² test. The subsequent log context analysis assessed the data of 30 identified users using basic statistics and visualized each user's information-seeking behavior while accessing NDSL. The visualizations showed 30 different paths for the 30 cases. Log analysis provided insight into both general and individual information-seeking behavior, and its results can improve the understanding of user actions. They can therefore serve as basic data for improving the design of digital library services and systems to meet users' needs.
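The χ² analysis of usage paths mentioned in this abstract can be illustrated with a minimal sketch. The contingency table below (user type by path category) is entirely hypothetical and is not taken from the study; the function computes Pearson's χ² statistic for independence using only the Python standard library.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are user types, columns are path categories
# (search -> view, search -> download, browse -> view).
observed_paths = [
    [120, 60, 20],  # logged-in users
    [80, 90, 30],   # non-logged-in users
]

chi2, dof = chi_square_statistic(observed_paths)
print(f"chi2 = {chi2:.2f}, dof = {dof}")
```

With 2 degrees of freedom, a statistic above 5.991 would reject independence at the 0.05 level, so for these made-up counts path choice would appear to depend on login status.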

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the wide range of information created while computer systems operate, are used in many processes, from system inspection and process optimization to user-specific customization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data generated by banks. Most of the log data generated during banking operations come from handling clients' business. Therefore, a separate log data processing system is needed to gather, store, categorize, and analyze the log data generated while processing that business. However, in existing computing environments it is difficult both to expand storage flexibly for a massive amount of unstructured log data and to run the many functions needed to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the analysis tools and management systems of the existing computing infrastructure. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to expand computing resources such as storage space and memory flexibly when storage must be extended or the volume of log data increases rapidly. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data.
Furthermore, because HDFS (the Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic recovery so that it can continue operating after a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, with strict schemas like those of relational databases, nodes cannot be added to distribute the stored data when the amount of data increases rapidly. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database by distributing data across nodes when the amount of data increases rapidly; it is a non-relational database with a structure suited to processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB was chosen because its flexible schema makes it easy to process unstructured log data, it facilitates flexible node expansion when the amount of data is increasing rapidly, and it provides an auto-sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data by log type and distributes them to the MongoDB module and the MySQL module. The log graph generator module renders the analysis results from the MongoDB module, the Hadoop-based analysis module, and the MySQL module by analysis time and log type, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. Log data aggregated per unit time are stored in the MongoDB module and plotted in graphs according to the user's analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance against a log data processing system that uses only MySQL demonstrates the proposed system's superiority. Moreover, an optimal chunk size is identified through an insert performance evaluation of MongoDB for various chunk sizes.
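The routing role of the log collector module described in this abstract might be sketched as follows. This is not the authors' implementation: the class name, the set of real-time log types, and the record fields are all hypothetical, and plain in-memory containers stand in for the MySQL and MongoDB modules so that the example runs without any database.

```python
from collections import defaultdict

# Hypothetical rule: log types assumed to need real-time analysis.
REALTIME_TYPES = {"transaction", "error"}


class LogCollector:
    """Classifies incoming log records by type and routes them to a store."""

    def __init__(self):
        # Stand-in for the MySQL module (real-time analysis).
        self.realtime_store = []
        # Stand-in for the MongoDB module: schema-free records grouped by type.
        self.aggregate_store = defaultdict(list)

    def collect(self, record):
        log_type = record.get("type", "unknown")
        if log_type in REALTIME_TYPES:
            self.realtime_store.append(record)
        else:
            self.aggregate_store[log_type].append(record)


# Hypothetical records; note that the fields differ from record to record,
# which is what makes a fixed relational schema awkward.
sample_records = [
    {"type": "transaction", "branch": "A01", "amount": 1500},
    {"type": "access", "user": "u17", "page": "/balance"},
    {"type": "error", "code": 502, "detail": "gateway timeout"},
    {"type": "batch", "job": "nightly-settlement", "rows": 120000},
]

collector = LogCollector()
for record in sample_records:
    collector.collect(record)

print(len(collector.realtime_store), sorted(collector.aggregate_store))
```

Keeping the aggregate store as a mapping from log type to a list of free-form dictionaries mirrors the document-oriented model the abstract attributes to MongoDB.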

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment

  • Kim, Myoungjin;Cui, Yun;Lee, Hanku
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.8 / pp.3182-3202 / 2015
  • Flexibly expanding the storage capacity required to process a large amount of rapidly increasing unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system providing features that categorize and analyze unstructured log data is extremely difficult. To overcome such limitations, we propose and design a MongoDB-based unstructured log processing system (MdbULPS) for collecting, categorizing, and analyzing log data generated from banks. The proposed system includes a Hadoop-based analysis module for reliable parallel-distributed processing of massive log data. Furthermore, because the Hadoop distributed file system (HDFS) stores data by generating replicas of collected log data in block units, the proposed system offers automatic system recovery against system failures and data loss. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. To evaluate the proposed system, we conducted three different performance tests on a local test bed including twelve nodes: comparing our system with a MySQL-based approach, comparing it with an HBase-based approach, and changing the chunk size option. From the experiments, we found that our system showed better performance in processing unstructured log data.

Log Analysis System Design using RTMA

  • Park, Hee-Chang;Myung, Ho-Min
    • 한국데이터정보과학회:학술대회논문집 / 2004.04a / pp.225-236 / 2004
  • Every web server keeps a repository of all actions and events that occur on the server. These server logs can be used to quantify user traffic. Intelligent analysis of the data provides a statistical baseline for determining server load, failed requests, and other events that shed light on site usage patterns. This information provides valuable leads for marketing and site management activities. In this paper, we propose a design method for a log analysis system using the RTMA (real-time monitoring and analysis) technique.
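The kind of baseline statistics this abstract describes (traffic per page and failed requests) can be sketched in a few lines. The abstract does not specify a log format or the RTMA design itself, so this example assumes Common Log Format lines and made-up sample data; the function name is hypothetical.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, e.g.
# host ident user [time] "METHOD path protocol" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Count hits per path and failed requests (status >= 400) per path."""
    hits = Counter()
    failed = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        path = match.group("path")
        hits[path] += 1
        if int(match.group("status")) >= 400:
            failed[path] += 1
    return hits, failed


# Hypothetical sample lines (documentation-range IP addresses).
sample_lines = [
    '192.0.2.1 - - [10/Oct/2004:13:55:36 +0900] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2004:13:55:40 +0900] "GET /missing.html HTTP/1.0" 404 209',
    '192.0.2.1 - - [10/Oct/2004:13:56:01 +0900] "GET /index.html HTTP/1.0" 200 2326',
]

hits, failed = summarize(sample_lines)
print(hits.most_common(), dict(failed))
```

A real-time variant would feed lines to the same counters as they arrive rather than reading a finished file.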


Methodology of Log Analysis for Intrusion Prevention based on LINUX (리눅스 기반 침입 방지를 위한 로그 분석 방법 연구)

  • Lim, Sung-Hwa;Lee, Do Hyeon;Kim, Jeom Goo
    • Convergence Security Journal / v.15 no.2 / pp.33-41 / 2015
  • To enhance security, a safe Linux system should have an audit capability that prevents illegal access to and alteration of data, as well as the ability to trace illegal activities. In addition, a log management and monitoring system is necessary to clearly delineate the responsibilities and activities of the system manager or administrator and of the users. In this paper, the Linux system's security log is analyzed and converted into a database so that it can be used to prevent and detect illegal intrusion. The proposed analysis allows safe management of the security log. The system will contribute to system reliability by allowing a quick response to system malfunctions.
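Converting a security log into a database, as this abstract describes, could look roughly like the sketch below. The abstract names neither the log file nor the database, so the example assumes OpenSSH-style "Failed password" lines and an in-memory SQLite database; the table and function names are hypothetical.

```python
import re
import sqlite3

# Assumed line shape (OpenSSH sshd):
# "... sshd[pid]: Failed password for [invalid user ]NAME from IP port N ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<username>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def load_failed_logins(lines, conn):
    """Parse failed-login lines and store them in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS failed_login "
        "(username TEXT, ip TEXT, port INTEGER)"
    )
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            conn.execute(
                "INSERT INTO failed_login VALUES (?, ?, ?)",
                (match.group("username"), match.group("ip"), int(match.group("port"))),
            )
    conn.commit()


# Hypothetical sample log lines.
sample_log = [
    "Jan 10 12:00:01 host sshd[1234]: Failed password for root from 203.0.113.9 port 51234 ssh2",
    "Jan 10 12:00:03 host sshd[1234]: Failed password for invalid user admin from 203.0.113.9 port 51236 ssh2",
    "Jan 10 12:01:00 host sshd[1300]: Accepted password for alice from 192.0.2.20 port 40000 ssh2",
]

conn = sqlite3.connect(":memory:")
load_failed_logins(sample_log, conn)
rows = conn.execute(
    "SELECT ip, COUNT(*) FROM failed_login GROUP BY ip ORDER BY COUNT(*) DESC, ip"
).fetchall()
print(rows)
```

Once the entries are in a table, questions such as "which addresses failed most often" become ordinary SQL queries instead of ad hoc text searches.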

Correlation Analysis of Event Logs for System Fault Detection (시스템 결함 분석을 위한 이벤트 로그 연관성에 관한 연구)

  • Park, Ju-Won;Kim, Eunhye;Yeom, Jaekeun;Kim, Sungho
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.2 / pp.129-137 / 2016
  • To identify the cause of an error and maintain the health of a system, an administrator usually analyzes event log data, since they contain useful information for inferring the cause. However, because today's systems are huge and complex, it is almost impossible for administrators to analyze event log files manually to identify the cause of an error. In particular, because OpenStack, which is widely used as a cloud management system, operates with various service modules linked across multiple servers, it is hard to access each node and analyze the event log messages of each service module when an error occurs. In this paper, we therefore propose a novel message-based log analysis method that enables the administrator to find the cause of an error quickly. Specifically, the proposed method 1) consolidates event log data generated at the system level and the application service level, 2) clusters the consolidated data based on messages, and 3) analyzes the interrelations among message groups in order to promptly identify the cause of a system error. This study is significant in three respects. First, the root cause of an error can be identified by collecting event logs at both the system level and the application service level and analyzing the interrelations among the logs. Second, administrators do not need to classify messages for training, since unsupervised learning is applied to the event log messages. Third, using Dynamic Time Warping, an algorithm for measuring the similarity of patterns that vary over time, increases the accuracy of analyzing patterns generated by a distributed system in which time synchronization is not exact.
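To show why Dynamic Time Warping helps when node clocks are not exactly synchronized, here is a minimal textbook implementation of the DTW distance. It is a generic sketch, not the paper's code, and the two series (per-interval message counts from two nodes) are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical message counts per time slot on two nodes; node_b sees the
# same burst one slot later because its clock lags.
node_a = [0, 1, 5, 1, 0, 0]
node_b = [0, 0, 1, 5, 1, 0]

pointwise = sum(abs(x - y) for x, y in zip(node_a, node_b))
warped = dtw_distance(node_a, node_b)
print(pointwise, warped)
```

A slot-by-slot comparison treats the shifted burst as a large difference, while DTW aligns the two bursts and reports the patterns as identical.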

Messaging System Analysis for Effective Embedded Tester Log Processing (효과적인 Embedded Tester Log 처리를 위한 Messaging System 분석)

  • Nam, Ki-ahn;Kwon, Oh-young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.645-648 / 2017
  • The existing embedded tester used TCP and a shared file system for log processing, organized in a 1-to-N structure. This method wastes tester resources on exception handling. We implemented a log processing message layer that can be distributed through a messaging system, and we compared transmission using the message layer with transmission using TCP and the shared file system. In the comparison, transmission using the message layer showed higher bandwidth than TCP. In terms of CPU usage, the message layer was less efficient than TCP, but the difference was not significant. Overall, log processing using the message layer shows higher efficiency.


A Study on the Intrusion Detection Method using Firewall Log (방화벽 로그를 이용한 침입탐지기법 연구)

  • Yoon, Sung-Jong;Kim, Jeong-Ho
    • Journal of Information Technology Applications and Management / v.13 no.4 / pp.141-153 / 2006
  • With the spread of high-speed Internet service, the importance of security is increasingly emphasized. A flawless security solution is therefore needed to block information leakage when data are sent or received. Large enterprises and public organizations can respond to this problem, but small organizations with limited staff and capital cannot. They therefore need to raise their level of information security by improving their existing information security systems without additional cost. No hacking can be carried out without passing through the intrusion blocking system (firewall) installed at the very front of the network. Therefore, if the firewall log is managed effectively, hacking attempts can be recognized at the pre-detection stage. This paper supports information security managers in analyzing the firewall log effectively, and provides a firewall log analysis module that reports hacking attacks by analyzing the log.
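One simple way a firewall log analysis module could flag likely hacking attempts is to count denied connections per source address and report those above a threshold. The abstract gives no log format or detection rule, so the line layout, the threshold, and the function name below are all assumptions made for illustration.

```python
from collections import Counter


def suspicious_sources(lines, threshold=3):
    """Return source IPs with at least `threshold` denied connections.

    Assumed line layout: "<timestamp> <ACTION> key=value key=value ..."
    """
    denied = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or parts[1] != "DENY":
            continue
        fields = dict(part.split("=", 1) for part in parts[2:] if "=" in part)
        source = fields.get("src")
        if source:
            denied[source] += 1
    return sorted(ip for ip, count in denied.items() if count >= threshold)


# Hypothetical firewall log lines (documentation-range IP addresses).
firewall_log = [
    "2006-03-01T02:10:01 DENY src=203.0.113.5 dst=192.0.2.10 dport=22",
    "2006-03-01T02:10:02 DENY src=203.0.113.5 dst=192.0.2.10 dport=23",
    "2006-03-01T02:10:03 DENY src=203.0.113.5 dst=192.0.2.10 dport=3389",
    "2006-03-01T02:11:00 DENY src=198.51.100.7 dst=192.0.2.10 dport=80",
    "2006-03-01T02:12:00 ALLOW src=198.51.100.7 dst=192.0.2.10 dport=443",
]

alerts = suspicious_sources(firewall_log)
print(alerts)
```

A fixed count threshold is the crudest possible rule; it is used here only to show where a notification step would hook into the log analysis.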


Development of App Analysis System and CMS System Open API (APP 분석 시스템 및 CMS시스템 오픈API 개발)

  • Kim, Sung Rim;Park, Hyeong Rok;Chun, Soojin
    • Journal of Korea Society of Digital Industry and Information Management / v.10 no.3 / pp.23-33 / 2014
  • Smartphones are changing the way people communicate, and the mobile app marketplace is growing rapidly. The App Store continues its rapid growth, with more than 900,000 mobile apps already available. We expect to see continued momentum throughout the business. Mobile is also becoming popular with marketers. Specialized app analysis systems are therefore becoming important to how marketers and app developers invest in, analyze, and market their apps. App analysis systems enable users to discover and analyze behavior through data observations and meaningful patterns. In this paper, we introduce NugaLog, an app analysis system and CMS system Open API. NugaLog acquires user data and engages with users in a variety of ways. Understanding how users interact with and move through an app is essential. NugaLog reports the number of users, smartphone model, smartphone OS, screen resolution, page views, and app version.

Real time predictive analytic system design and implementation using Bigdata-log (빅데이터 로그를 이용한 실시간 예측분석시스템 설계 및 구현)

  • Lee, Sang-jun;Lee, Dong-hoon
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.6 / pp.1399-1410 / 2015
  • Gartner urges companies to change their survival paradigms considerably, insisting that they need to understand and prepare for the upcoming era of data competition. As successful business cases of predictive analytics based on statistical algorithms are revealed, shifting from after-the-fact response through data analysis to preemptive countermeasures through predictive analysis is also becoming a necessity for leading enterprises. This trend is influencing security analysis and log analysis, and cases in which big data analysis frameworks are applied to large-scale log analysis and to intelligent, long-term security analysis are being reported one after another. However, not all the functions and techniques required for a big data log analysis system can be accommodated in a Hadoop-based big data platform, so big data log analysis products based on independent platforms are still being brought to market. This paper proposes a framework that equips these independent big data log analysis systems with real-time and non-real-time predictive analysis engines and can cope with cyber attacks preemptively.