• 제목/요약/키워드: Big data storage

Search Result 206, Processing Time 0.024 seconds

Big Data Management in Structured Storage Based on Fintech Models for IoMT using Machine Learning Techniques (기계학습법을 이용한 IoMT 핀테크 모델을 기반으로 한 구조화 스토리지에서의 빅데이터 관리 연구)

  • Kim, Kyung-Sil
    • Advanced Industrial SCIence
    • /
    • v.1 no.1
    • /
    • pp.7-15
    • /
    • 2022
  • To adopt the development in the medical scenario IoT developed towards the advancement with the processing of a large amount of medical data defined as an Internet of Medical Things (IoMT). The vast range of collected medical data is stored in the cloud in the structured manner to process the collected healthcare data. However, it is difficult to handle the huge volume of the healthcare data so it is necessary to develop an appropriate scheme for the healthcare structured data. In this paper, a machine learning mode for processing the structured heath care data collected from the IoMT is suggested. To process the vast range of healthcare data, this paper proposed an MTGPLSTM model for the processing of the medical data. The proposed model integrates the linear regression model for the processing of healthcare information. With the developed model outlier model is implemented based on the FinTech model for the evaluation and prediction of the COVID-19 healthcare dataset collected from the IoMT. The proposed MTGPLSTM model comprises of the regression model to predict and evaluate the planning scheme for the prevention of the infection spreading. The developed model performance is evaluated based on the consideration of the different classifiers such as LR, SVR, RFR, LSTM and the proposed MTGPLSTM model and the different size of data as 1GB, 2GB and 3GB is mainly concerned. The comparative analysis expressed that the proposed MTGPLSTM model achieves ~4% reduced MAPE and RMSE value for the worldwide data; in case of china minimal MAPE value of 0.97 is achieved which is ~ 6% minimal than the existing classifier leads.

Anomaly Detection Technique of Log Data Using Hadoop Ecosystem (하둡 에코시스템을 활용한 로그 데이터의 이상 탐지 기법)

  • Son, Siwoon;Gil, Myeong-Seon;Moon, Yang-Sae
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.2
    • /
    • pp.128-133
    • /
    • 2017
  • In recent years, the number of systems for the analysis of large volumes of data is increasing. Hadoop, a representative big data system, stores and processes the large data in the distributed environment of multiple servers, where system-resource management is very important. The authors attempted to detect anomalies from the rapid changing of the log data that are collected from the multiple servers using simple but efficient anomaly-detection techniques. Accordingly, an Apache Hive storage architecture was designed to store the log data that were collected from the multiple servers in the Hadoop ecosystem. Also, three anomaly-detection techniques were designed based on the moving-average and 3-sigma concepts. It was finally confirmed that all three of the techniques detected the abnormal intervals correctly, while the weighted anomaly-detection technique is more precise than the basic techniques. These results show an excellent approach for the detection of log-data anomalies with the use of simple techniques in the Hadoop ecosystem.

Analysis of Factors for Korean Women's Cancer Screening through Hadoop-Based Public Medical Information Big Data Analysis (Hadoop기반의 공개의료정보 빅 데이터 분석을 통한 한국여성암 검진 요인분석 서비스)

  • Park, Min-hee;Cho, Young-bok;Kim, So Young;Park, Jong-bae;Park, Jong-hyock
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.10
    • /
    • pp.1277-1286
    • /
    • 2018
  • In this paper, we provide flexible scalability of computing resources in cloud environment and Apache Hadoop based cloud environment for analysis of public medical information big data. In fact, it includes the ability to quickly and flexibly extend storage, memory, and other resources in a situation where log data accumulates or grows over time. In addition, when real-time analysis of accumulated unstructured log data is required, the system adopts Hadoop-based analysis module to overcome the processing limit of existing analysis tools. Therefore, it provides a function to perform parallel distributed processing of a large amount of log data quickly and reliably. Perform frequency analysis and chi-square test for big data analysis. In addition, multivariate logistic regression analysis of significance level 0.05 and multivariate logistic regression analysis of meaningful variables (p<0.05) were performed. Multivariate logistic regression analysis was performed for each model 3.

Design of Cloud Service Platform for eGovernment

  • LEE, Choong Hyong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.1
    • /
    • pp.201-209
    • /
    • 2021
  • The term, eGovernmen or e-Government, uses technology communications devices such as computers and the Internet to provide public services to citizens and others. The eGovernment or e-government provides citizens with new opportunities to access the government directly and conveniently, while the government provides citizens with directservices. Also, in these days, cloud computing is a feature that enables users to use computer system resources, especially data storage (cloud storage) and on-demand computing power, without having to manage themselves. The term is commonly used to describe data centers that are available to many users over the Internet. Today, the dominant Big Cloud is distributed across multiple central servers. You can designate it as an Edge server if it is relatively close to the user. However, despite the prevalence of e-government and cloud computing, each of these concepts has evolved. Research attempts to combine these two concepts were not being made properly. For this reason, in this work, we aim to produce independent and objective analysis results by separating progress steps for the analysis of e-government cloud service platforms. This work will be done through an analysis of the development process and architectural composition of the e-government development standard framework and the cloud platform PaaS-TA. In addition, this study is expected to derive implications from an analysis perspective on the direction and service composition of the e-government cloud service platform currently being pursued.

The Security Architecture for Secure Cloud Computing Environment

  • Choi, Sang-Yong;Jeong, Kimoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.12
    • /
    • pp.81-87
    • /
    • 2018
  • Cloud computing is a computing environment in which users borrow as many IT resources as they need to, and use them over the network at any point in time. This is the concept of leasing and using as many IT resources as needed to lower IT resource usage costs and increase efficiency. Recently, cloud computing is emerging to provide stable service and volume of data along with major technological developments such as the Internet of Things, artificial intelligence and big data. However, for a more secure cloud environment, the importance of perimeter security such as shared resources and resulting secure data storage and access control is growing. This paper analyzes security threats in cloud computing environments and proposes a security architecture for effective response.

The Security and Privacy Issues of Fog Computing

  • Sultan Algarni;Khalid Almarhabi;Ahmed M. Alghamdi;Asem Alradadi
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.4
    • /
    • pp.25-31
    • /
    • 2023
  • Fog computing diversifies cloud computing by using edge devices to provide computing, data storage, communication, management, and control services. As it has a decentralised infrastructure that is capable of amalgamating with cloud computing as well as providing real-time data analysis, it is an emerging method of using multidisciplinary domains for a variety of applications; such as the IoT, Big Data, and smart cities. This present study provides an overview of the security and privacy concerns of fog computing. It also examines its fundamentals and architecture as well as the current trends, challenges, and potential methods of overcoming issues in fog computing.

Efficient Authentication of Aggregation Queries for Outsourced Databases (아웃소싱 데이터베이스에서 집계 질의를 위한 효율적인 인증 기법)

  • Shin, Jongmin;Shim, Kyuseok
    • Journal of KIISE
    • /
    • v.44 no.7
    • /
    • pp.703-709
    • /
    • 2017
  • Outsourcing databases is to offload storage and computationally intensive tasks to the third party server. Therefore, data owners can manage big data, and handle queries from clients, without building a costly infrastructure. However, because of the insecurity of network systems, the third-party server may be untrusted, thus the query results from the server may be tampered with. This problem has motivated significant research efforts on authenticating various queries such as range query, kNN query, function query, etc. Although aggregation queries play a key role in analyzing big data, authenticating aggregation queries has not been extensively studied, and the previous works are not efficient for data with high dimension or a large number of distinct values. In this paper, we propose the AMR-tree that is a data structure, applied to authenticate aggregation queries. We also propose an efficient proof construction method and a verification method with the AMR-tree. Furthermore, we validate the performance of the proposed algorithm by conducting various experiments through changing parameters such as the number of distinct values, the number of records, and the dimension of data.

A Study on the Compression and Major Pattern Extraction Method of Origin-Destination Data with Principal Component Analysis (주성분분석을 이용한 기종점 데이터의 압축 및 주요 패턴 도출에 관한 연구)

  • Kim, Jeongyun;Tak, Sehyun;Yoon, Jinwon;Yeo, Hwasoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.4
    • /
    • pp.81-99
    • /
    • 2020
  • Origin-destination data have been collected and utilized for demand analysis and service design in various fields such as public transportation and traffic operation. As the utilization of big data becomes important, there are increasing needs to store raw origin-destination data for big data analysis. However, it is not practical to store and analyze the raw data for a long period of time since the size of the data increases by the power of the number of the collection points. To overcome this storage limitation and long-period pattern analysis, this study proposes a methodology for compression and origin-destination data analysis with the compressed data. The proposed methodology is applied to public transit data of Sejong and Seoul. We first measure the reconstruction error and the data size for each truncated matrix. Then, to determine a range of principal components for removing random data, we measure the level of the regularity based on covariance coefficients of the demand data reconstructed with each range of principal components. Based on the distribution of the covariance coefficients, we found the range of principal components that covers the regular demand. The ranges are determined as 1~60 and 1~80 for Sejong and Seoul respectively.

Development of Artificial Intelligence-based Legal Counseling Chatbot System

  • Park, Koo-Rack
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.3
    • /
    • pp.29-34
    • /
    • 2021
  • With the advent of the 4th industrial revolution era, IT technology is creating new services that have not existed by converging with various existing industries and fields. In particular, in the field of artificial intelligence, chatbots and the latest technologies have developed dramatically with the development of natural language processing technology, and various business processes are processed through chatbots. This study is a study on a system that provides a close answer to the question the user wants to find by creating a structural form for legal inquiries through Slot Filling-based chatbot technology, and inputting a predetermined type of question. Using the proposal system, it is possible to construct question-and-answer data in a more structured form of legal information, which is unstructured data in text form. In addition, by managing the accumulated Q&A data through a big data storage system such as Apache Hive and recycling the data for learning, the reliability of the response can be expected to continuously improve.

Secure and Efficient Client-side Deduplication for Cloud Storage (안전하고 효율적인 클라이언트 사이드 중복 제거 기술)

  • Park, Kyungsu;Eom, Ji Eun;Park, Jeongsu;Lee, Dong Hoon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.1
    • /
    • pp.83-94
    • /
    • 2015
  • Deduplication, which is a technique of eliminating redundant data by storing only a single copy of each data, provides clients and a cloud server with efficiency for managing stored data. Since the data is saved in untrusted public cloud server, however, both invasion of data privacy and data loss can be occurred. Over recent years, although many studies have been proposed secure deduplication schemes, there still remains both the security problems causing serious damages and inefficiency. In this paper, we propose secure and efficient client-side deduplication with Key-server based on Bellare et. al's scheme and challenge-response method. Furthermore, we point out potential risks of client-side deduplication and show that our scheme is secure against various attacks and provides high efficiency for uploading big size of data.