• Title/Summary/Keyword: NoSQL Database System


Classification of HTTP Automated Software Communication Behavior Using a NoSQL Database

  • Tran, Manh Cong;Nakamura, Yasuhiro
    • IEIE Transactions on Smart Processing and Computing / v.5 no.2 / pp.94-99 / 2016
  • Application layer attacks have for years posed an ever-serious threat to network security, since they always come after a technically legitimate connection has been established. In recent years, cyber criminals have turned to fully exploiting the web as a medium of communication to launch a variety of forbidden or illicit activities by spreading malicious automated software (auto-ware) such as adware, spyware, or bots. When this malicious auto-ware infects a network, it will act like a robot, mimic normal behavior of web access, and bypass the network firewall or intrusion detection system. Besides that, in a private and large network, with huge Hypertext Transfer Protocol (HTTP) traffic generated each day, communication behavior identification and classification of auto-ware is a challenge. In this paper, based on a previous study, analysis of auto-ware communication behavior, and with the addition of new features, a method for classification of HTTP auto-ware communication is proposed. For that, a Not Only Structured Query Language (NoSQL) database is applied to handle large volumes of unstructured HTTP requests captured every day. The method is tested with real HTTP traffic data collected through a proxy server of a private network, providing good results in the classification and detection of suspicious auto-ware web access.

A NoSQL data management infrastructure for bridge monitoring

  • Jeong, Seongwoon;Zhang, Yilan;O'Connor, Sean;Lynch, Jerome P.;Sohn, Hoon;Law, Kincho H.
    • Smart Structures and Systems / v.17 no.4 / pp.669-690 / 2016
  • Advances in sensor technologies have led to the instrumentation of sensor networks for bridge monitoring and management. For a dense sensor network, an enormous amount of sensor data is collected. The data need to be managed, processed, and interpreted. Data management issues are of prime importance for a bridge management system. This paper describes a data management infrastructure for bridge monitoring applications. Specifically, NoSQL database systems such as MongoDB and Apache Cassandra are employed to handle time-series data as well as the unstructured bridge information model data. Standard XML-based modeling languages such as OpenBrIM and SensorML are adopted to manage semantically meaningful data and to support interoperability. Data interoperability and integration among different components of a bridge monitoring system that includes on-site computers, a central server, local computing platforms, and mobile devices are illustrated. The data management framework is demonstrated using the data collected from the wireless sensor network installed on the Telegraph Road Bridge, Monroe, MI.

An Efficient Design and Implementation of an MdbULPS in a Cloud-Computing Environment

  • Kim, Myoungjin;Cui, Yun;Lee, Hanku
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.8 / pp.3182-3202 / 2015
  • Flexibly expanding the storage capacity required to process a large amount of rapidly increasing unstructured log data is difficult in a conventional computing environment. In addition, implementing a log processing system providing features that categorize and analyze unstructured log data is extremely difficult. To overcome such limitations, we propose and design a MongoDB-based unstructured log processing system (MdbULPS) for collecting, categorizing, and analyzing log data generated from banks. The proposed system includes a Hadoop-based analysis module for reliable parallel-distributed processing of massive log data. Furthermore, because the Hadoop distributed file system (HDFS) stores data by generating replicas of collected log data in block units, the proposed system offers automatic system recovery against system failures and data loss. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. To evaluate the proposed system, we conducted three different performance tests on a local test bed of twelve nodes: comparing our system with a MySQL-based approach, comparing it with an HBase-based approach, and changing the chunk size option. In these experiments, our system showed better performance in processing unstructured log data.

HBase based Business Process Event Log Schema Design of Hadoop Framework

  • Ham, Seonghun;Ahn, Hyun;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.5 / pp.49-55 / 2019
  • Organizations design and operate business process models to achieve their goals efficiently and systematically. With the advancement of information technology, the number of tasks in which computer systems can participate has grown, and processes have become large and complicated. This has produced more complex and finely subdivided business process flows. The process instances, which contain workcases and events, are larger and carry more data. These event logs are an essential resource for process mining and are used directly in model discovery, analysis, and process improvement. As event logs grow in size and scope, managing them with existing file-level programs or a relational database leads to problems such as capacity management and I/O load. In this paper, treating the event log as big data, we identify the management limits of approaches based on the original files or on relational databases. We then design and apply schemas to archive and analyze large event logs using Hadoop, an open-source distributed storage and processing framework, and HBase, a NoSQL database system.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions that let the system continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, with strict schemas like those of relational databases, it is difficult to add nodes and distribute the stored data across them when the amount of data increases rapidly. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node distribution when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an auto-sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
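
The routing performed by the log collector module, as described in the abstract above, might look roughly like the following. This is a minimal sketch and not the authors' implementation: the class name, the `type` and `timestamp` fields, and the two in-memory lists standing in for the MySQL and MongoDB modules are all assumptions made for illustration.

```python
from collections import Counter


class LogCollector:
    """Classifies log entries by type and routes them to one of two stores.

    Hypothetical sketch: the lists below stand in for the MySQL module
    (real-time data) and the MongoDB module (aggregated data).
    """

    def __init__(self, realtime_types):
        self.realtime_types = set(realtime_types)
        self.realtime_store = []   # stand-in for the MySQL module
        self.document_store = []   # stand-in for the MongoDB module

    def collect(self, entry):
        """Route one entry and return the name of the store that received it."""
        if entry.get("type") in self.realtime_types:
            self.realtime_store.append(entry)
            return "realtime"
        self.document_store.append(entry)
        return "document"


def counts_per_hour(entries):
    """Aggregate entries per unit time (here one hour), keyed by 'YYYY-MM-DDTHH'."""
    return Counter(entry["timestamp"][:13] for entry in entries)


if __name__ == "__main__":
    collector = LogCollector(realtime_types={"transaction"})
    collector.collect({"type": "transaction", "timestamp": "2013-05-01T09:15:00"})
    collector.collect({"type": "batch", "timestamp": "2013-05-01T09:40:00"})
    print(counts_per_hour(collector.document_store))
```

Entries without a fixed set of fields fit naturally into the document-style store, which is the property the abstract attributes to MongoDB's flexible schema.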

NVST DATA ARCHIVING SYSTEM BASED ON FASTBIT NOSQL DATABASE

  • Liu, Ying-Bo;Wang, Feng;Ji, Kai-Fan;Deng, Hui;Dai, Wei;Liang, Bo
    • Journal of The Korean Astronomical Society / v.47 no.3 / pp.115-122 / 2014
  • The New Vacuum Solar Telescope (NVST) is a 1-meter vacuum solar telescope that aims to observe the fine structures of active regions on the Sun. The main tasks of the NVST are high-resolution imaging and spectral observations, including measurements of the solar magnetic field. The NVST has collected more than 20 million FITS files since it began routine observations in 2012 and produces up to 120 thousand files in a day. Given the large number of files, effective archiving and retrieval becomes a critical and urgent problem. In this study, we implement a new data archiving system for the NVST based on the Fastbit Not Only Structured Query Language (NoSQL) database. Compared to a relational database (i.e., MySQL; My Structured Query Language), the Fastbit database shows distinct advantages in indexing and querying performance. In a large-scale database of 40 million records, the multi-field combined query of the Fastbit database responds about 15 times faster and fully meets the requirements of the NVST. Our study brings a new idea for massive astronomical data archiving and would contribute to the design of data management systems for other astronomical telescopes.
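
FastBit is known for bitmap indexing, which is one reason multi-field combined queries can be answered quickly: each condition selects a bitmap, and the conditions are combined with bitwise AND. The abstract does not describe the index layout used for the NVST archive, so the following is only an uncompressed toy sketch of the general idea. The class and the field names (`obs_date`, `band`) are hypothetical, and Python integers are used as bitmaps in place of FastBit's compressed structures.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index: one bitmap per distinct value of each indexed field.

    Bit i of a bitmap is set when record i holds that value.
    """

    def __init__(self, fields):
        self.fields = list(fields)
        self.bitmaps = {field: defaultdict(int) for field in self.fields}
        self.count = 0

    def add(self, record):
        """Index one record (a dict holding every indexed field)."""
        bit = 1 << self.count
        for field in self.fields:
            self.bitmaps[field][record[field]] |= bit
        self.count += 1

    def query(self, **conditions):
        """Return ids of records matching all field=value conditions."""
        result = (1 << self.count) - 1          # start with every record selected
        for field, value in conditions.items():
            result &= self.bitmaps[field].get(value, 0)
        return [i for i in range(self.count) if (result >> i) & 1]


if __name__ == "__main__":
    index = BitmapIndex(["obs_date", "band"])
    index.add({"obs_date": "2014-01-01", "band": "TiO"})
    index.add({"obs_date": "2014-01-01", "band": "Ha"})
    index.add({"obs_date": "2014-01-01", "band": "TiO"})
    print(index.query(obs_date="2014-01-01", band="TiO"))
```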

Implementation of query model of CQRS pattern using weather data (기상 데이터를 활용한 CQRS 패턴의 조회 모델 구현)

  • Seo, Bomin;Jeon, Cheolho;Jeon, Hyeonsig;An, Seyun;Park, Hyun-ju
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.6 / pp.645-651 / 2019
  • As ever larger volumes of data are produced, software architectures and data storage patterns are changing, because data workloads tend to be more read-intensive than write-intensive. Accordingly, this paper uses the query model of the Command Query Responsibility Segregation (CQRS) pattern, which separates the responsibilities of commands and queries, to implement an efficient high-capacity data lookup system that meets users' requirements. Using the 2018 temperature, humidity, and precipitation data from the Korea Meteorological Administration Open API, about 2.3 billion records are stored in forms suited to an RDBMS (PostgreSQL) and a NoSQL database (MongoDB). The paper also compares and analyzes, from the perspective of the implemented web server, the performance of systems with and without the CQRS pattern, the storage structure performance of each database, and the performance corresponding to the data processing characteristics.
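
The separation of command and query responsibilities mentioned in the abstract above can be illustrated with a small in-memory example. This is a sketch under stated assumptions, not the paper's implementation: the class names, the observation fields, and the daily-mean view are hypothetical, and plain Python objects stand in for the PostgreSQL and MongoDB stores.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordObservation:
    """Command: a request to record one weather observation (hypothetical shape)."""
    station: str
    date: str              # e.g. "2018-07-01"
    temperature_c: float


class WriteModel:
    """Command side: validates and appends observations, then notifies subscribers."""

    def __init__(self):
        self.log = []
        self.subscribers = []

    def handle(self, command):
        if not command.station:
            raise ValueError("station is required")
        self.log.append(command)
        for notify in self.subscribers:
            notify(command)


class DailyMeanReadModel:
    """Query side: a denormalized view that is cheap to read."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def apply(self, command):
        """Update the view from one recorded observation."""
        key = (command.station, command.date)
        self._sum[key] += command.temperature_c
        self._count[key] += 1

    def mean_temperature(self, station, date):
        """Return the mean temperature, or None when nothing was recorded."""
        key = (station, date)
        count = self._count.get(key, 0)
        return self._sum[key] / count if count else None


if __name__ == "__main__":
    writes, reads = WriteModel(), DailyMeanReadModel()
    writes.subscribers.append(reads.apply)
    writes.handle(RecordObservation("Seoul", "2018-07-01", 20.0))
    writes.handle(RecordObservation("Seoul", "2018-07-01", 30.0))
    print(reads.mean_temperature("Seoul", "2018-07-01"))
```

Keeping the read view precomputed is what makes the pattern attractive for read-heavy workloads: queries never scan the write log.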

Technique of Range Query in Encrypted Database (암호화 데이터베이스에서 영역 질의를 위한 기술)

  • Kim, Cheon-Shik;Kim, Hyoung-Joong;Hong, You-Sik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.3 / pp.22-30 / 2008
  • Recently, the protection of personal information has become increasingly important, and many countries have enacted legislation on it. Protecting personal information is now required not merely for the sake of a company's image but as a legal obligation. Most enterprise databases store customers' names, addresses, and credit card numbers. Such personal information is sensitive and is a strategic asset, so most enterprises make an effort to keep it safe. If someone steals the DBMS administrator's password, however, the system can no longer be trusted. Encryption is therefore required to protect the data in the database. Database encryption, however, causes problems with database performance in terms of computation time and restricts the SQL queries that can be used. In this paper, we propose an efficient query method to solve this problem for encrypted data.

Development of educational programs for managing medical information utilizing medical data generation and analysis techniques (의료 데이터 발생과 분석기술을 활용한 의료정보관리 교육용 프로그램 개발)

  • Choi, Joonyoung
    • Journal of Digital Convergence / v.15 no.10 / pp.377-386 / 2017
  • This study developed an educational medical information management program that can improve learners' ability to manage medical information. The program was developed over 8 months using VB. For the database, the ACCESS Database was used, which allows learners to easily understand the structure of the data. After analyzing the medical records, learners enter data in the discharge analysis program, the cancer registration program, and the incomplete program. After entering and saving data, learners can use the medical information management program to understand and analyze the structure of the database and generate medical information. The educational program can improve learners' ability to manage medical information by having them extract the necessary data from the database directly through SQL and create various kinds of medical information. However, although it is an educational program, it has no system for evaluating how learners operate it. Future studies should therefore develop an assessment system for evaluating learners.

An Implementation of Web-based Client/Server Architecture using Distributed Objects (분산 객체를 이용한 웹기반 클라이언트 / 서버 구조의 구현)

  • 박희창;이태공
    • Journal of the military operations research society of Korea / v.23 no.2 / pp.25-44 / 1997
  • The number of Internet users has increased rapidly thanks to the convenient GUI environment. The current Web-based HTTP/CGI client/server architecture has several problems, such as the CGI bottleneck, no maintenance of state, and no load balancing. However, with Java and CORBA technologies, together called “Object Web technology”, we can solve them, because Java is not only mobile code but also platform-independent code, and CORBA has the ability to build distributed objects and a language-independent object model. The goal of “Object Web technology” is to create multivendor, multi-OS, multilanguage “legoware” using objects. This paper implements a “Book Search System”, a Web-based client/server architecture using distributed objects. The implementation environment consists of a Hangul Windows NT server (including IIS), a Hangul Windows 95 client, Visigenic's VisiBroker for Java 1.2 (a CORBA 2.0 product), HTTP over TCP/IP, and a Sybase SQL Anywhere 5.0 database server; the interface between the application server and the database is JDBC-ODBC bridge middleware.
