• Title/Summary/Keyword: Elasticsearch

A Security Log Analysis System using Logstash based on Apache Elasticsearch (아파치 엘라스틱서치 기반 로그스태시를 이용한 보안로그 분석시스템)

  • Lee, Bong-Hwan;Yang, Dong-Min
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.2 / pp.382-389 / 2018
  • Cyber attacks can cause serious damage to a wide range of information systems, and log data analysis is one way to address this problem. A security log analysis system helps an organization cope with security risks by collecting, storing, and analyzing log data. In this paper, a security log analysis system is designed and implemented that analyzes security log data using Logstash, which can collect and process various types of log data, together with Elasticsearch, a distributed search engine. Kibana, an open source data visualization plugin for Elasticsearch, is used to generate log statistics and search reports and to visualize the results. The performance of the Elasticsearch-based security log analysis system is compared with an existing log analysis system that uses the Flume log collector, the Flume HDFS sink, and HBase. The experimental results show that the proposed system greatly reduces both database query processing time and log data analysis time compared with the existing Hadoop-based log analysis system.
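
The collect, parse, and analyze steps described in this abstract can be illustrated with a minimal Python sketch. It does not reproduce the paper's Logstash or Elasticsearch setup: the log line format, the field names, and the helper names `parse_line` and `failed_logins_by_ip` are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical sshd-style log format, assumed only for this sketch.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) sshd: "
    r"(?P<outcome>Failed|Accepted) password for (?P<user>\S+) "
    r"from (?P<src_ip>\S+)$"
)


def parse_line(line):
    """Turn one raw log line into a structured event, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def failed_logins_by_ip(lines):
    """Count failed logins per source IP, similar in spirit to a dashboard statistic."""
    events = (parse_line(line) for line in lines)
    return Counter(
        event["src_ip"]
        for event in events
        if event is not None and event["outcome"] == "Failed"
    )


SAMPLE_LINES = [
    "2018-01-01T00:00:01Z web1 sshd: Failed password for root from 10.0.0.5",
    "2018-01-01T00:00:02Z web1 sshd: Failed password for admin from 10.0.0.5",
    "2018-01-01T00:00:03Z web1 sshd: Accepted password for alice from 10.0.0.7",
    "unparseable line",
]

if __name__ == "__main__":
    print(failed_logins_by_ip(SAMPLE_LINES))
```

In a real deployment the parsing step would typically live in a Logstash filter and the counting step in an Elasticsearch aggregation; the sketch only shows the shape of the data flow.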

Improving Elasticsearch for Chinese, Japanese, and Korean Text Search through Language Detector

  • Kim, Ki-Ju;Cho, Young-Bok
    • Journal of information and communication convergence engineering / v.18 no.1 / pp.33-38 / 2020
  • Elasticsearch is an open source search and analytics engine that can search petabytes of data in near real time. It is designed as a horizontally scalable, highly available distributed system. It provides RESTful APIs, which makes it programming-language agnostic. Full text search of multilingual text requires language-specific analyzers and field mappings appropriate for indexing and searching that text. Additionally, a language detector can be used in conjunction with the analyzers to improve multilingual text search. Elasticsearch provides more than 40 language analysis plugins that can process text and extract language-specific tokens, as well as language detector plugins that can determine the language of a given text. This study investigates three different approaches to indexing and searching Chinese, Japanese, and Korean (CJK) text (single analyzer, multi-fields, and language detector-based), and identifies the advantages of the language detector-based approach compared to the other two.
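
As a rough illustration of how a multi-field mapping and a language detector could work together, the Python sketch below builds an index mapping with one sub-field per language and routes a query to the matching sub-field. It is not the paper's implementation. The analyzer names `smartcn`, `kuromoji`, and `nori` come from the corresponding Elasticsearch analysis plugins; the field name `body`, the helper names, and the script-range heuristic (which stands in for a real language detector plugin) are assumptions.

```python
# Index mapping with one sub-field per CJK language (field names are hypothetical).
MAPPING = {
    "mappings": {
        "properties": {
            "body": {
                "type": "text",
                "fields": {
                    "zh": {"type": "text", "analyzer": "smartcn"},
                    "ja": {"type": "text", "analyzer": "kuromoji"},
                    "ko": {"type": "text", "analyzer": "nori"},
                },
            }
        }
    }
}


def detect_cjk_language(text):
    """Toy detector based on Unicode script ranges, not a real detector plugin."""
    has_han = False
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3 or 0x1100 <= code <= 0x11FF:
            return "ko"  # Hangul syllables or Jamo
        if 0x3040 <= code <= 0x30FF:
            return "ja"  # Hiragana or Katakana
        if 0x4E00 <= code <= 0x9FFF:
            has_han = True  # Han ideographs occur in Chinese, Japanese, and Korean text
    return "zh" if has_han else "unknown"


def build_query(text):
    """Build a match query against the sub-field for the detected language."""
    lang = detect_cjk_language(text)
    field = f"body.{lang}" if lang in ("zh", "ja", "ko") else "body"
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    for sample in ("엘라스틱서치", "検索エンジン", "搜索引擎", "search"):
        print(sample, "->", build_query(sample))
```

Routing each query to a single sub-field avoids running every analyzer on every query, which is one plausible motivation for a detector-based design; the abstract itself does not spell out the mechanism.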

Performance Analysis of Real-Time Big Data Search Platform Based on High-Capacity Persistent Memory (대용량 영구 메모리 기반 실시간 빅데이터 검색 플랫폼 성능 분석)

  • Eunseo Lee;Dongchul Park
    • Journal of Platform Technology / v.11 no.4 / pp.50-61 / 2023
  • The advancement of various big data technologies has had a tremendous impact on many industries. Diverse big data research studies have been conducted to process and analyze massive data quickly. Under these circumstances, emerging technologies such as high-capacity persistent memory (PMEM) and Compute Express Link (CXL) have lately attracted significant attention. However, little investigation into big data "search" platforms has been made. Moreover, most big data software platforms are still optimized for traditional DRAM-based computing systems. This paper first evaluates the basic performance of Intel Optane PMEM, and then investigates both the indexing and searching performance of Elasticsearch, a widely known enterprise big data search platform, on a PMEM-based computing system to explore its effectiveness and potential. Extensive experiments show that the proposed Optane PMEM-based Elasticsearch improves indexing and searching performance by an average of 1.45 times and 3.2 times, respectively, compared to a DRAM-based system. Consequently, this paper demonstrates that high-I/O, high-capacity, nonvolatile PMEM-based computing systems are very promising for big data search platforms.

Analysis and Visualization of Real Estate Market Price using Elasticsearch (Elasticsearch를 이용한 부동산 시장 가격 분석 및 시각화)

  • Seung-Yeon Hwang;Jeong-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.185-190 / 2024
  • In 2022, the real estate market in Korea was in decline. COVID-19 and the Russian invasion of Ukraine are cited as the biggest causes. These two events triggered an economic recession, causing prices to fall and subsequently raising exchange rates and interest rates. As a result, the number of actual transactions in the previously active real estate market decreased, and the market declined under high interest rates. Data provided by the public data portal, KOSIS, and the Seoul Metropolitan Government were collected through Logstash and transferred to Elasticsearch, and inflation, exchange rates, and loan interest rates were visualized using the dashboard function provided by Kibana in order to analyze the causes and derive results. In addition, three specific apartments in Nowon-gu and Jongno-gu, which have the highest number of actual transactions in Seoul, are selected, and their actual transaction prices, which change every month, are displayed in a Data Table.

Real-Time Indexing Performance Optimization of Search Platform Based on Big Data Cluster (빅데이터 클러스터 기반 검색 플랫폼의 실시간 인덱싱 성능 최적화)

  • Nayeon Keum;Dongchul Park
    • Journal of Platform Technology / v.11 no.6 / pp.89-105 / 2023
  • With the development of information technology, most information has been converted into digital form, leading to the Big Data era. The demand for search platforms has increased in order to enhance the accessibility and usability of information in databases. Big data search software platforms consist of two main components: (1) an indexing component that generates and stores data indices for fast and efficient data search and (2) a searching component that looks up the given data quickly. As the amount of data has increased explosively, data indexing performance has become a key bottleneck of big data search platforms. Though many companies have adopted big data search platforms, relatively little research has been done to improve indexing performance. This study employs the Elasticsearch platform, one of the best-known enterprise big data search platforms, and builds physical clusters of 3 nodes to investigate optimal indexing performance configurations. Our comprehensive experiments demonstrate that the proposed optimal Elasticsearch configuration improves indexing performance by an average of 3.13 times.
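
The abstract does not say which settings were tuned, so the Python sketch below only illustrates the indexing path that such tuning usually targets: the newline-delimited JSON body accepted by the Elasticsearch `_bulk` endpoint. The index name `logs`, the sample documents, and the helper name `build_bulk_body` are assumptions for illustration.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize documents as an NDJSON body for the _bulk endpoint.

    Each document becomes two lines: an action line and the document source.
    The body must end with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample_docs = [{"msg": "first event"}, {"msg": "second event"}]
    print(build_bulk_body("logs", sample_docs), end="")
```

Batching many documents into one request like this reduces per-request overhead, which is why batch size is a common knob in indexing benchmarks.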

Auto Configuration Module for Logstash in Elasticsearch Ecosystem

  • Ahmed, Hammad;Park, Yoosang;Choi, Jongsun;Choi, Jaeyoung
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.39-42 / 2018
  • Log analysis and monitoring are important in most systems. Log management is of core importance in distributed applications, cloud-based applications, and applications designed for big data. These applications produce a large number of log files that contain essential information. This information can be used in log analytics to understand the relevant patterns in varying log data. However, tools are needed for parsing, storing, and visualizing log information. "Elasticsearch, Logstash, and Kibana" (ELK Stack) is one of the most popular analysis tools for log management. For the ingestion of log files, configuration files are of key importance, as they cover all the services needed to input, process, and output the log files. However, creating configuration files is often complicated and time consuming, as it requires domain expertise and manual work. In this paper, an auto configuration module for Logstash is proposed that automatically generates Logstash configuration files. The primary purpose of this paper is to provide a mechanism that can generate the configuration files for the corresponding log files in less time. The proposed module aims to improve the overall efficiency of the log management system.
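
To make the idea of generating a Logstash configuration concrete, here is a minimal Python sketch that renders the input, filter, and output sections from a few parameters. It is not the module proposed in the paper. The `file` input, `grok` filter, and `elasticsearch` output are standard Logstash plugins; the function name `render_logstash_config` and the example path, pattern, host, and index name are assumptions.

```python
def render_logstash_config(log_path, grok_pattern, es_hosts, index_name):
    """Render a Logstash pipeline config with file input, grok filter, and ES output."""
    hosts = ", ".join(f'"{host}"' for host in es_hosts)
    # Doubled braces are literal braces in the f-string template.
    return f"""input {{
  file {{
    path => "{log_path}"
    start_position => "beginning"
  }}
}}
filter {{
  grok {{
    match => {{ "message" => "{grok_pattern}" }}
  }}
}}
output {{
  elasticsearch {{
    hosts => [{hosts}]
    index => "{index_name}"
  }}
}}
"""


if __name__ == "__main__":
    print(
        render_logstash_config(
            log_path="/var/log/apache2/access.log",  # hypothetical path
            grok_pattern="%{COMBINEDAPACHELOG}",
            es_hosts=["http://localhost:9200"],  # hypothetical host
            index_name="apache-logs",  # hypothetical index name
        )
    )
```

A real auto-configuration module would also need to infer the right grok pattern from sample log lines, which is the hard part and is not attempted here.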

A Development of Optimal Travel Course Recommendation System based on Altered TSP and Elasticsearch Algorithm (변형된 TSP 및 엘라스틱서치 알고리즘 기반의 최적 여행지 코스 추천 시스템 개발)

  • Kim, Jun-Yeong;Jo, Kyeong-Ho;Park, Jun;Jung, Se-Hoon;Sim, Chun-Bo
    • Journal of Korea Multimedia Society / v.22 no.9 / pp.1108-1121 / 2019
  • As the quality and standard of living rise, many people search for various kinds of information about tourism. In addition, users prefer search methods that reflect individual opinions, such as SNS and blogs, to the official websites of tourist destinations. Many previous studies focused on recommendation systems for tourist courses based on users' GPS information and past travel records, but such systems were not capable of recommending the latest tourist trends. This study therefore set out to collect and analyze the latest SNS data to recommend tourist destinations of high interest among users. It also aimed to propose an altered TSP algorithm that recommends optimal routes to the recommended destinations within an area, and a system that recommends optimal tourist courses by applying the Elasticsearch engine. The altered TSP algorithm proposed in the study used the location information of users, instead of the Dijkstra's algorithm technique used in previous studies, to select a tourist destination, and allowed users to check the recommended courses for all tourist destinations within an area, thus offering more diverse tourist destination recommendations than previous studies.
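
The abstract does not describe how the altered TSP algorithm works internally, so the Python sketch below substitutes a generic nearest-neighbour heuristic that starts from the user's location. It is only meant to show what "ordering destinations into a course from a starting point" can look like; the function names and the sample coordinates are assumptions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_neighbour_course(user_location, destinations):
    """Order destinations by repeatedly visiting the closest unvisited one.

    This is a simple heuristic for TSP-like routing, not an optimal solver
    and not the paper's algorithm.
    """
    remaining = dict(destinations)
    current = user_location
    course = []
    while remaining:
        name = min(remaining, key=lambda n: haversine_km(current, remaining[n]))
        current = remaining.pop(name)
        course.append(name)
    return course


if __name__ == "__main__":
    # Hypothetical destinations placed along the equator for easy checking.
    places = {"A": (0.0, 1.0), "B": (0.0, 3.0), "C": (0.0, 2.0)}
    print(nearest_neighbour_course((0.0, 0.0), places))
```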

Enhancement of Internal Network Security in Small Networks Using UTM and ELK Stack (UTM과 ELK Stack을 활용한 소규모 네트워크의 내부망 보안 강화방안)

  • Song Ha Min;DongHwi Lee
    • Convergence Security Journal / v.24 no.1 / pp.3-9 / 2024
  • Cyberattacks and security threats are constantly evolving, and organizations need quick and efficient security response methods. This paper proposes ways to strengthen internal network security by using Unified Threat Management (UTM) equipment to improve network security, and by using the Elastic Stack (Elasticsearch, Logstash, Kibana; hereinafter referred to as ELK Stack) to effectively manage and analyze the internal network log data collected through this equipment.

Anomaly Detection Analysis using Repository based on Inverted Index (역방향 인덱스 기반의 저장소를 이용한 이상 탐지 분석)

  • Park, Jumi;Cho, Weduke;Kim, Kangseok
    • Journal of KIISE / v.45 no.3 / pp.294-302 / 2018
  • With the emergence of new service industries due to the development of information and communication technology, cyberspace risks such as personal information infringement and industrial confidentiality leakage have diversified, and security has emerged as a critical issue. In this paper, we propose a behavior-based anomaly detection method that is suitable for real-time, large-volume data analysis. We show that the proposed detection method is superior to existing signature-based security countermeasures, using large-capacity user log data related to in-company personal information abuse and internal information leakage. As the proposed behavior-based anomaly detection method requires a technique for processing large amounts of data, a real-time search engine called Elasticsearch, which is based on an inverted index, is used. In addition, statistics-based frequency analysis and preprocessing were performed for data analysis, and the DBSCAN algorithm, a density-based clustering method, was applied to classify abnormal data, with an example visualized for easy analysis. Unlike existing anomaly detection systems, the proposed behavior-based anomaly detection technique is promising because it enables anomaly detection analysis without the need to set a threshold value separately, and it was proposed from a statistical perspective.
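
For readers unfamiliar with the term, an inverted index maps each term to the set of documents that contain it, so a query only touches the posting sets of its terms instead of scanning every document. The Python sketch below is a toy version with whitespace tokenization; it is not how Elasticsearch or the paper's repository is implemented, and the sample log messages and function names are assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every term in the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Hypothetical user log messages.
    logs = {
        1: "login failed for admin",
        2: "login succeeded for alice",
        3: "file download by admin",
    }
    idx = build_inverted_index(logs)
    print(search(idx, "login admin"))
```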

Safety Autonomous Platform Design with Ensemble AI Models (앙상블 인공지능 모델을 활용한 안전 관리 자율운영 플랫폼 설계)

  • Dongyeop Lee;Daesik Lim;Soojeong Woo;Youngho Moon;Minjeong Kim;Joonwon Lee
    • Journal of Advanced Navigation Technology / v.28 no.1 / pp.159-162 / 2024
  • This paper proposes a novel safety autonomous platform (SAP) architecture that can automatically and precisely manage on-site safety through ensemble artificial intelligence models generated from video information, workers' biometric information, and safety rules to estimate the risk index. We designed the proposed SAP architecture using the Hadoop ecosystem with Kafka/NiFi, Spark/Hive, Hue, ELK (Elasticsearch, Logstash, Kibana), Ansible, etc., and confirmed that it worked well with safety mobility gateways for providing various safety applications.