• Title/Summary/Keyword: NoSQL database

Search Result 60, Processing Time 0.025 seconds

Design and implementation of a Large-Scale Security Log Collection System based on Hadoop Ecosystem (Hadoop Ecosystem 기반 대용량 보안로그 수집 시스템 설계 및 구축)

  • Lee, Jong-Yoon;Lee, Bong-Hwan
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.04a
    • /
    • pp.461-463
    • /
    • 2014
  • As network attacks have become more diverse and frequent, various security solutions have emerged to identify the types of hacking attacks. One of them, the integrated security management system, can establish security policies through the management and analysis of diverse logs and thus prepare for future attacks; however, most existing integrated security management systems rely on relational databases and cannot keep up with rapidly increasing data volumes. A method for processing large volumes of log data is needed to prevent the loss of information-rich log data and to avoid system degradation. We therefore propose a Hadoop-based log collection system that uses the Hadoop ecosystem, which is specialized for distributed processing, to respond flexibly to growing data. Going beyond the existing NoSQL log storage approach, it applies normalization at the log storage stage to improve processing and storage capability, offering real-time processing and storage with excellent scalability.
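The normalization-at-storage idea can be illustrated with a minimal sketch: parse each raw log line into fixed fields before it is written to distributed storage, keeping unparseable lines so no data is lost. The log format and field names below are illustrative assumptions, not taken from the paper.

```python
import re

# Hypothetical normalization step: parse a raw security log line into
# fixed fields before it is written to distributed storage.  The log
# format and the field names (ts, src, action, detail) are illustrative.
LOG_PATTERN = re.compile(
    r"(?P<ts>\S+ \S+) (?P<src>\d+\.\d+\.\d+\.\d+) "
    r"(?P<action>\w+) (?P<detail>.*)"
)

def normalize(line: str) -> dict:
    """Return a normalized record, or a raw fallback if parsing fails."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return {"raw": line}   # keep unparseable lines to avoid data loss
    return m.groupdict()

record = normalize("2014-04-01 12:00:01 10.0.0.5 DROP port scan detected")
```

Normalizing at ingest means downstream queries filter on structured fields instead of scanning raw text, which is where the claimed processing and storage improvement would come from.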

Study of MongoDB Architecture by Data Complexity for Big Data Analysis System (빅데이터 분석 시스템 구현을 위한 데이터 구조의 복잡성에 따른 MongoDB 환경 구성 연구)

  • Hyeopgeon Lee;Young-Woon Kim;Jin-Woo Lee;Seong Hyun Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.16 no.5
    • /
    • pp.354-361
    • /
    • 2023
  • Big data analysis systems apply NoSQL databases like MongoDB to store, process, and analyze diverse forms of large-scale data. MongoDB offers scalability and fast data processing speeds through distributed processing and data replication, depending on its configuration. This paper investigates the suitable MongoDB environment configurations for implementing big data analysis systems. For performance evaluation, we configured both single-node and multi-node environments. In the multi-node setup, we expanded the number of data nodes from two to three and measured the performance in each environment. According to the analysis, the processing speeds for complex data structures with three or more dimensions are approximately 5.75% faster in the single-node environment compared to an environment with two data nodes. However, a setting with three data nodes processes data about 25.15% faster than the single-node environment. On the other hand, for simple one-dimensional data structures, the multi-node environment processes data approximately 28.63% faster than the single-node environment. Further research is needed to practically validate these findings with diverse data structures and large volumes of data.

A Study on Big Data Processing Technology Based on Open Source for Expansion of LIMS (실험실정보관리시스템의 확장을 위한 오픈 소스 기반의 빅데이터 처리 기술에 관한 연구)

  • Kim, Soon-Gohn
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.14 no.2
    • /
    • pp.161-167
    • /
    • 2021
  • A Laboratory Information Management System (LIMS) is a centralized database for storing, processing, retrieving, and analyzing laboratory data; it refers to a computer system specially designed for laboratories performing inspection, analysis, and testing tasks. In particular, a LIMS provides functions to support the operation of the laboratory and requires workflow management and data-tracking support. In this paper, we collect data from websites and various other channels using crawling, one of the automated big data collection technologies, to support laboratory operation. Among the collected test methods and contents, those that testers can usefully apply are recommended. In addition, we implement a complementary LIMS platform capable of verifying the collection channels by managing feedback.
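The crawling step can be sketched with the standard library alone: extract candidate test-method titles from a fetched page. The tag and class convention here is a hypothetical example; a real collector would first fetch pages over HTTP and would target the markup of each specific channel.

```python
from html.parser import HTMLParser

# Illustrative extraction step of a crawler: pull test-method titles out
# of fetched HTML.  The <h2 class="method"> convention is an assumption
# for the demo, not a format from the paper.
class TitleExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._grab = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and ("class", "method") in attrs:
            self._grab = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._grab = False

    def handle_data(self, data):
        if self._grab and data.strip():
            self.titles.append(data.strip())

p = TitleExtractor()
p.feed('<h2 class="method">Water hardness test</h2><p>procedure...</p>')
```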

Development of Sensor Data Flow Detection and MQTT Simulation System to apply formalized Pattern Analysis (정형화된 패턴분석을 적용한 센서 데이터흐름 감지 및 MQTT 시뮬레이션 시스템 개발)

  • JongWon Cho;Hyeri Park;Fayzullayev mirjalol;Ryumduck Oh
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.131-134
    • /
    • 2024
  • In this paper, we implemented a monitoring system platform for conventional railway operation and management: it detects various real-time stream data from noise, vibration, and fine-dust sensors in the environment surrounding the railway, organizes and stores the data so that formalized data patterns can be recognized and analyzed, and supports visualization of the analyzed data. Serial communication was initially the main technique for data transmission, but as the number of sensors and devices grew, its limitations became apparent. Therefore, instead of the existing direct communication between the Arduino boards and the server, this study introduces a Raspberry Pi on which an MQTT broker is installed to handle communication. The railway data monitoring system platform was built using MongoDB, a NoSQL database, and Grafana for data visualization.

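The formalized pattern-analysis step described in this entry can be sketched as a simple sliding-window check over each sensor stream: flag the stream when the moving average of recent readings crosses a threshold. The window size and the threshold values below are assumed for the demo, not taken from the paper.

```python
from collections import deque

# Illustrative pattern-analysis step: flag a sensor stream when the
# moving average of recent readings exceeds a per-sensor threshold.
# Window size and thresholds are assumed values.
THRESHOLDS = {"noise_db": 70.0, "vibration_mm_s": 5.0, "dust_ug_m3": 80.0}

class StreamMonitor:
    def __init__(self, sensor: str, window: int = 5):
        self.sensor = sensor
        self.readings = deque(maxlen=window)   # sliding window of readings

    def push(self, value: float) -> bool:
        """Add a reading; return True if the window average breaks the threshold."""
        self.readings.append(value)
        avg = sum(self.readings) / len(self.readings)
        return avg > THRESHOLDS[self.sensor]

mon = StreamMonitor("noise_db")
alerts = [mon.push(v) for v in (60, 65, 68, 80, 95)]
# only the last reading pushes the 5-sample average past 70 dB
```

In the described architecture, a detector like this would subscribe to the MQTT broker on the Raspberry Pi and write both readings and alerts to MongoDB for Grafana to visualize.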

Sorting Cuckoo: Enhancing Lookup Performance of Cuckoo Hashing Using Insertion Sort (Sorting Cuckoo: 삽입 정렬을 이용한 Cuckoo Hashing의 입력 연산의 성능 향상)

  • Min, Dae-hong;Jang, Rhong-ho;Nyang, Dae-hun;Lee, Kyung-hee
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.42 no.3
    • /
    • pp.566-576
    • /
    • 2017
  • Key-value stores have proved their worth by being applied in various NoSQL databases such as Redis and Memcached. Lookup performance is important because key-value store applications perform more lookup than insert operations in most environments. However, in traditional applications, lookups may be slow because hash tables are built from linked lists. Cuckoo hashing has therefore attracted academic attention for its constant lookup time, and bucketized cuckoo hashing (BCH) has been proposed because it can achieve a high load factor. In this paper, we introduce Sorting Cuckoo, which inserts data using insertion sort within the BCH structure. Because the data in each bucket are sorted, Sorting Cuckoo can determine the existence of a key with relatively few memory accesses, and the higher the memory load factor, the better its lookup performance compared with BCH. Experimental results at a 95% load factor show that Sorting Cuckoo performs about 19 million (25%) fewer memory accesses than BCH over 10 million negative lookups (key not in the table), and about 4 million (10%) fewer over 10 million positive lookups (key present).
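The core idea can be sketched compactly: in bucketized cuckoo hashing each key has two candidate buckets of a few slots each; if each bucket is kept sorted on insert, a lookup can binary-search the bucket and stop early instead of scanning every slot. The bucket count, slot count, and hash choices below are assumptions for the demo, and the cuckoo kick-out path is omitted.

```python
import bisect

# Minimal sketch of the sorted-bucket idea behind Sorting Cuckoo.
# NUM_BUCKETS, SLOTS, and the second hash are demo assumptions; the
# eviction ("kick-out") path of real cuckoo hashing is omitted.
NUM_BUCKETS, SLOTS = 8, 4

def two_buckets(key: int):
    h1 = hash(key) % NUM_BUCKETS
    h2 = (h1 ^ hash(str(key))) % NUM_BUCKETS   # simplified second hash
    return h1, h2

class SortingCuckoo:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, key: int) -> bool:
        for b in two_buckets(key):
            bucket = self.buckets[b]
            if len(bucket) < SLOTS:
                bisect.insort(bucket, key)     # insertion sort keeps bucket sorted
                return True
        return False                           # would trigger kick-out in full BCH

    def lookup(self, key: int) -> bool:
        for b in two_buckets(key):
            bucket = self.buckets[b]
            i = bisect.bisect_left(bucket, key)   # binary search in sorted bucket
            if i < len(bucket) and bucket[i] == key:
                return True
        return False

t = SortingCuckoo()
for k in (3, 17, 42, 99):
    t.insert(k)
```

The paper's gain comes from exactly this property: a negative lookup in a sorted bucket can stop as soon as it passes where the key would be, touching fewer slots (and cache lines) than a full scan.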

A Self-Service Business Intelligence System for Recommending New Crops (재배 작물 추천을 위한 셀프서비스 비즈니스 인텔리전스 시스템)

  • Kim, Sam-Keun;Kim, Kwang-Chae;Kim, Hyeon-Woo;Jeong, Woo-Jin;Ahn, Jae-Geun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.3
    • /
    • pp.527-535
    • /
    • 2021
  • Traditional business intelligence (BI) systems have been widely used as tools for timely, better decision-making. On the other hand, building a data warehouse (DW) for efficient analysis of rapidly growing data is time-consuming and complex. In particular, the ETL (Extract, Transform, and Load) process required to build a data warehouse has become much more complex as BI platforms move to cloud environments. Various BI solutions based on NoSQL databases, such as MongoDB, have been proposed to overcome these ETL issues. Decision-makers want easy access to data without the help of IT departments or BI experts, and self-service BI (SSBI) has recently emerged as a way to meet this need. This paper proposes a self-service BI system over farming data, using the MongoDB cloud as the DW, to support the selection of new crops by return-farmers. The proposed system provides insights to decision-makers through data visualization using MongoDB Charts, reporting for advanced data search, and monitoring for real-time data analysis. Decision-makers can access data directly in various ways and analyze it in a self-service manner using these functions.
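The kind of query a self-service dashboard issues against such a store can be sketched as a MongoDB aggregation pipeline, built as plain Python data. The collection layout and field names (region, crop, yield_kg) are hypothetical, not from the paper.

```python
# Illustrative MongoDB aggregation a self-service dashboard might run
# against a farming collection: average yield per crop for one region.
# Field names (region, crop, yield_kg) are hypothetical.
def crop_ranking_pipeline(region: str) -> list:
    return [
        {"$match": {"region": region}},                              # filter rows
        {"$group": {"_id": "$crop", "avg_yield": {"$avg": "$yield_kg"}}},
        {"$sort": {"avg_yield": -1}},                                # best first
        {"$limit": 5},                                               # top 5 crops
    ]

pipeline = crop_ranking_pipeline("Gyeonggi")
# would be executed as collection.aggregate(pipeline) via a driver
```

Because the pipeline runs inside MongoDB rather than in a separate ETL layer, this is the mechanism by which such systems sidestep the DW-building step the abstract describes.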

A Security Nonce Generation Algorithm Scheme Research for Improving Data Reliability and Anomaly Pattern Detection of Smart City Platform Data Management (스마트시티 플랫폼 데이터 운영의 이상패턴 탐지 및 데이터 신뢰성 향상을 위한 보안 난수 생성 알고리즘 방안 연구)

  • Lee, Jaekwan;Shin, Jinho;Joo, Yongjae;Noh, Jaekoo;Kim, Jae Do;Kim, Yongjoon;Jung, Namjoon
    • KEPCO Journal on Electric Power and Energy
    • /
    • v.4 no.2
    • /
    • pp.75-80
    • /
    • 2018
  • The smart city develops energy systems efficiently through common management of city resources, aiming at growth and a low-carbon society. However, when existing security technologies (authentication, integrity, confidentiality) rely on fixed security keys, the smart city cannot effectively verify anomalous patterns in the big data it generates. This paper proposes a security nonce generation algorithm for detecting an adversary's anomalous patterns, with key safety improved through key generation by a KDC (Key Distribution Center). In the proposed scheme, the KDC distributes the generated security nonces and authentication keys to each facility system. Security is enhanced by detecting external anomalous patterns and replacing keys with new ones via the distributed nonces and keys. The proposed approach therefore improves the security and accountability of smart city platform management data through anomaly pattern detection and key safety.
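The nonce mechanism can be sketched as follows: a KDC-like issuer hands each facility a fresh random nonce per session, and presentation of an unknown or already-consumed nonce is treated as an anomalous (replay-like) pattern. The class names and the 16-byte nonce size are assumptions for the demo, not the paper's exact algorithm.

```python
import secrets

# Hedged sketch of the nonce idea: single-use random nonces issued by a
# KDC-like component; an unknown or reused nonce is flagged as anomalous.
# Names and the 16-byte size are demo assumptions.
class NonceIssuer:
    def __init__(self):
        self.outstanding = set()

    def issue(self) -> bytes:
        nonce = secrets.token_bytes(16)    # cryptographically strong randomness
        self.outstanding.add(nonce)
        return nonce

    def is_anomalous(self, nonce: bytes) -> bool:
        """Valid exactly once; a reused or never-issued nonce is suspicious."""
        if nonce in self.outstanding:
            self.outstanding.remove(nonce)  # consume: each nonce is single-use
            return False
        return True

kdc = NonceIssuer()
n = kdc.issue()
first_use = kdc.is_anomalous(n)   # legitimate single use
replay = kdc.is_anomalous(n)      # second presentation is flagged
```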

Multi-type and shape data meta management and dynamic user configurable interface method (다종다형 자료 메타 관리 및 사용자 동적 구성 가능한 검색 인터페이스 제공 방안)

  • Choi, Myungjin;Kim, Taeyoung;Lee, Minseob;Yang, Yunjung;Yoon, Kyoungwon;Kim, Moongi
    • Journal of Satellite, Information and Communications
    • /
    • v.12 no.1
    • /
    • pp.81-87
    • /
    • 2017
  • In this paper, we present a system in which users can search and manage data through a unified interface and define search fields dynamically. The system has three distinguishing features: first, it can manage meta information for multi-type, multi-shape data; second, it provides a database integration bus that eases integration between various systems; third, search items can be set per user, so each user can customize the view of multi-type data. The system studied in this paper is expected to be capable of managing big data, which is currently receiving wide attention in the ICT field. In addition, it should make it possible to manage multi-type, multi-shape data effectively in various fields and to integrate systems with different environments easily.

OHDSI OMOP-CDM Database Security Weakness and Countermeasures (OHDSI OMOP-CDM 데이터베이스 보안 취약점 및 대응방안)

  • Lee, Kyung-Hwan;Jang, Seong-Yong
    • Journal of Information Technology Services
    • /
    • v.21 no.4
    • /
    • pp.63-74
    • /
    • 2022
  • Globally, researchers at medical institutions are actively sharing patient cohort data to develop vaccines and treatments to overcome the COVID-19 crisis. OMOP-CDM, a common data model for efficiently sharing medical research data while being operated independently by individual medical institutions, contains patients' personal information (e.g., PII and PHI). Although PII and PHI are managed and shared indistinguishably through de-identification or anonymization at medical institutions, complete de-identification and anonymization cannot be guaranteed 100%. For this reason the security of the OMOP-CDM database is important, but no detailed, specific OMOP-CDM security inspection tool exists, so risk mitigation is currently performed with general-purpose security inspection tools. This study presents a model for implementing a tool that checks OMOP-CDM security vulnerabilities, based on an analysis of US database security guidelines and the NIST security controls for personal information protection, and verifies its feasibility through a field demonstration in three actual hospital environments. Checking the security status of the test server and the operational CDM databases of the three hospitals showed that most database audit and encryption functions were insufficient. These inspection results were applied to an optimization study of the complex and time-consuming CDM CSF developed in the "Development of Security Framework Required for CDM-based Distributed Research" task of the Korea Health Industry Promotion Agency. According to several recent newspaper articles, ransomware attacks on financially large hospitals are intensifying. Organizations that currently operate or will operate CDM databases need to install database auditing (proofing) and encryption (data protection), which the OMOP-CDM database template does not provide, to keep attackers from compromising them.

A Study on the Construction of Database, Online Management System, and Analysis Instrument for Biological Diversity Data (생물다양성 자료의 데이터베이스화와 온라인 관리시스템 및 분석도구 구축에 관한 연구)

  • Bec Kee-Yul;Jung Jong-Chul;Park Seon-Joo;Lee Jong-Wook
    • Journal of Environmental Science International
    • /
    • v.14 no.12
    • /
    • pp.1119-1127
    • /
    • 2005
  • The management of biological diversity data is presently complex and confusing. This study was initiated to construct a database, an online management system, and an analysis instrument so that such data could be stored consistently, correcting the problems inherent in the current incoherent storage methods. MySQL was used as the DBMS (DataBase Management System), and the program was built primarily with Java technology. The program was also designed to adapt to constantly changing requirements, which we aimed to achieve through easily and quickly modifiable programming techniques and patterns. To this end, an effective and flexible database schema was devised for storing and analyzing diversity data. Even users with no knowledge of databases should be able to access the management instrument and manage the database easily through the World Wide Web. On the basis of databases stored in this manner, the analysis instrument supplied on the Web can be used routinely across various datasets. By presenting derived results in simple tables and visualizing them with simple charts, researchers can easily adapt these methods to various data analyses. Because the diversity data are stored in a database rather than in ordinary files, this study enables precise, error-free, high-quality storage in a consistent manner. The proposed methods should also minimize the errors that can arise during data search, data movement, or data conversion by supplying management instrumentation on the Web. This study also makes it possible to derive results at the required level and to perform comparative analyses without the lengthy time ordinarily needed, supplying an analytical instrument whose results are comparable to those of other methods of analysis. The results of this research may be summarized as follows: 1) This study suggests storage methods that give consistency to diversity data. 2) It prepares a foundation for comparative analysis of various data. 3) It may suggest further research leading to better standardization of diversity data and better methods for predicting changes in species diversity.