• 제목/요약/키워드: 빅데이터의 처리 및 분석기법

Search Result 115, Processing Time 0.028 seconds

An Implementation of Federated Learning based on Blockchain (블록체인 기반의 연합학습 구현)

  • Park, June Beom;Park, Jong Sou
    • The Journal of Bigdata
    • /
    • v.5 no.1
    • /
    • pp.89-96
    • /
    • 2020
  • Deep learning using an artificial neural network has been recently researched and developed in various fields such as image recognition, big data and data analysis. However, federated learning has emerged to solve issues of data privacy invasion and problems that increase the cost and time required to learn. Federated learning presented learning techniques that would bring the benefits of distributed processing system while solving the problems of existing deep learning, but there were still problems with server-client system and motivations for providing learning data. So, we replaced the role of the server with a blockchain system in federated learning, and conducted research to solve the privacy and security problems that are associated with federated learning. In addition, we have implemented a blockchain-based system that motivates users by paying compensation for data provided by users, and requires less maintenance costs while maintaining the same accuracy as existing learning. In this paper, we present the experimental results to show the validity of the blockchain-based system, and compare the results of the existing federated learning with the blockchain-based federated learning. In addition, as a future study, we ended the thesis by presenting solutions to security problems and applicable business fields.

Consideration for Improving the Vulnerability of the Cloud Hypervisor Architecture (클라우드 하이퍼바이저 구조의 취약점 개선을 위한 고찰)

  • Kim, Tae Woo;Suk, Sang Kee;Park, Jong Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.05a
    • /
    • pp.238-241
    • /
    • 2020
  • 클라우드 컴퓨팅 (Cloud Computing)은 언제 어디서든 인터넷을 통하여 필요한 컴퓨팅 자원을 원하는 시간만큼 활용할 수 있는 최신 컴퓨팅 방식으로 사용자에게 효율적인 컴퓨팅 자원을 제공한다. 또한 빅데이터 및 인공지능 분야에서의 활용도가 높아 4차 산업혁명의 기초 인프라로 부각되고 있다. 클라우드의 독립적인 컴퓨팅 자원을 하이퍼바이저 (Hypervisor)를 통해 효율적으로 관리한다. 본 논문에서는 클라우드 하이퍼바이저에 대한 공격 기법인 커널 기반 루트킷, 캐시 기반 부 채널 공격, ROP (Return oriented Programming) 공격의 공격 방법과 대응 방안을 분석한다. 이후 기존에 연구된 하이퍼바이저 보안을 위한 클라우드 컴퓨팅 아키텍처를 소개하고, 하이퍼바이저 구조의 취약점에 대해 고찰한다. 마지막으로 하이퍼바이저 기반 클라우드 컴퓨팅 아키텍처의 문제점과 해결방안을 고찰한다.

Graph Database Benchmarking Systems Supporting Diversity (다양성을 지원하는 그래프 데이터베이스 벤치마킹 시스템)

  • Choi, Do-Jin;Baek, Yeon-Hee;Lee, So-Min;Kim, Yun-A;Kim, Nam-Young;Choi, Jae-Young;Lee, Hyeon-Byeong;Lim, Jong-Tae;Bok, Kyoung-Soo;Song, Seok-Il;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.12
    • /
    • pp.84-94
    • /
    • 2021
  • Graph databases have been developed to efficiently store and query graph data composed of vertices and edges to express relationships between objects. Since the query types of graph database show very different characteristics from traditional NoSQL databases, benchmarking tools suitable for graph databases to verify the performance of the graph database are needed. In this paper, we propose an efficient graph database benchmarking system that supports diversity in graph inputs and queries. The proposed system utilizes OrientDB to conduct benchmarking for graph databases. In order to support the diversity of input graphs and query graphs, we use LDBC that is an existing graph data generation tool. We demonstrate the feasibility and effectiveness of the proposed scheme through analysis of benchmarking results. As a result of performance evaluation, it has been shown that the proposed system can generate customizable synthetic graph data, and benchmarking can be performed based on the generated graph data.

A Study on Intelligent Skin Image Identification From Social media big data

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.9
    • /
    • pp.191-203
    • /
    • 2022
  • In this paper, we developed a system that intelligently identifies skin image data from big data collected from social media Instagram and extracts standardized skin sample data for skin condition diagnosis and management. The system proposed in this paper consists of big data collection and analysis stage, skin image analysis stage, training data preparation stage, artificial neural network training stage, and skin image identification stage. In the big data collection and analysis stage, big data is collected from Instagram and image information for skin condition diagnosis and management is stored as an analysis result. In the skin image analysis stage, the evaluation and analysis results of the skin image are obtained using a traditional image processing technique. In the training data preparation stage, the training data were prepared by extracting the skin sample data from the skin image analysis result. And in the artificial neural network training stage, an artificial neural network AnnSampleSkin that intelligently predicts the skin image type using this training data was built up, and the model was completed through training. In the skin image identification step, skin samples are extracted from images collected from social media, and the image type prediction results of the trained artificial neural network AnnSampleSkin are integrated to intelligently identify the final skin image type. The skin image identification method proposed in this paper shows explain high skin image identification accuracy of about 92% or more, and can provide standardized skin sample image big data. The extracted skin sample set is expected to be used as standardized skin image data that is very efficient and useful for diagnosing and managing skin conditions.

A Realtime Malware Detection Technique Using Multiple Filter (다중 필터를 이용한 실시간 악성코드 탐지 기법)

  • Park, Jae-Kyung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.7
    • /
    • pp.77-85
    • /
    • 2014
  • Recently, several environment damage caused by malicious or suspicious code is increasing. We study comprehensive response system actively for malware detection. Suspicious code is installed on your PC without your consent, users are unaware of the damage. Also, there are need to technology for realtime processing of Big Data. We must develope advanced technology for malware detection. We must analyze the static, dynamic of executable file for fundamentally malware detection in recently and verified by a reputation for verification. It is need to judgment of similarity for realtime response with big data. In this paper, we proposed realtime detection and verification technology using multiple filter. Our malware study suggests a new direction of realtime malware detection.

Asymmetric Index Management Scheme for High-capacity Compressed Databases (대용량 압축 데이터베이스를 위한 비대칭 색인 관리 기법)

  • Byun, Si-Woo;Jang, Seok-Woo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.7
    • /
    • pp.293-300
    • /
    • 2016
  • Traditional databases exploit a record-based model, where the attributes of a record are placed contiguously in a slow hard disk to achieve high performance. On the other hand, for read-intensive data analysis systems, the column-based compressed database has become a proper model because of its superior read performance. Currently, flash memory SSD is largely recognized as the preferred storage media for high-speed analysis systems. This paper introduces a compressed column-storage model and proposes a new index and its data management scheme for a high-capacity data warehouse system. The proposed index management scheme is based on the asymmetric index duplication and achieves superior search performance using the master index and compact index, particularly for large read-mostly databases. In addition, the data management scheme contributes to the read performance and high reliability by compressing the related columns and replicating them in two mirrored SSD. Based on the results of the performance evaluation under the high workload conditions, the data management scheme outperforms the traditional scheme in terms of the search throughput and response time.

국제 개인정보보호 표준화 동향 분석 (2017년 4월 해밀턴 SC27 회의 결과를 중심으로)

  • Youm, Heung-Youl
    • Review of KIISC
    • /
    • v.27 no.5
    • /
    • pp.43-48
    • /
    • 2017
  • 개인정보관리체계 [1,2] 를 구축하기 위해서는 관리체계를 위한 요구사항과 프라이버시 통제가 필요하다. 국내에서 시행되고 있는 개인정보관리체계도 요구사항과 개인정보 전주기동안의 프라이버시 보호조치에 근거해 시행하고 있다. 빅데이터 환경에서는 개인정보를 처리하기 위한 비식별화 기법(de-identification technique)이 요구된다. 그리고 온라인 사용자 친화적 고지 및 통보 방법이 필요하다. 국제표준화위원회/전기위원회 합동위원회 1의 정보보호기술연구반 신원 관리 및 프라이버시 작업반 (ISO/IEC JTC 1/SC 27/WG 5)에서는 개인정보보호를 위한 여러 가지 국제표준을 개발하고 있다[20],[21],[22],[30]. 본 논문에서는 작업반 5에서 2017년 4월 뉴질랜드 해밀턴 SC27 회의에서 논의된 개인정보보호 관련 주요 표준화 이슈와 대응 방안을 제시한다.

Systematic Research on Privacy-Preserving Distributed Machine Learning (프라이버시를 보호하는 분산 기계 학습 연구 동향)

  • Min Seob Lee;Young Ah Shin;Ji Young Chun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.2
    • /
    • pp.76-90
    • /
    • 2024
  • Although artificial intelligence (AI) can be utilized in various domains such as smart city, healthcare, it is limited due to concerns about the exposure of personal and sensitive information. In response, the concept of distributed machine learning has emerged, wherein learning occurs locally before training a global model, mitigating the concentration of data on a central server. However, overall learning phase in a collaborative way among multiple participants poses threats to data privacy. In this paper, we systematically analyzes recent trends in privacy protection within the realm of distributed machine learning, considering factors such as the presence of a central server, distribution environment of the training datasets, and performance variations among participants. In particular, we focus on key distributed machine learning techniques, including horizontal federated learning, vertical federated learning, and swarm learning. We examine privacy protection mechanisms within these techniques and explores potential directions for future research.

Abnormal Situation Analysis of Railway Point Machine Using Data Cube and OLAP (Data cube와 OLAP기법을 이용한 철도 선로전환기의 이상상황 분석)

  • Choi, Heesu;Xu, Zhenshun;Lim, Chulhoo;Park, Daihee;Chung, Yongwha;Kim, Heeyoung;Yoon, Sukhan
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2016.04a
    • /
    • pp.558-561
    • /
    • 2016
  • 선로전환기는 분기기에서 철도의 궤도를 변경하는 핵심장치 중 하나로서, 해당 부품의 고장은 열차사고에 직접적인 영향을 미친다. 현재 철도 현장에서는 관리자가 모니터링 시스템을 통해 선로전환기의 장애 및 이상상황을 감시하고 지침서에 따라 관리를 수행한다. 본 논문에서는 실제 현장에서 발생하는 대규모의 선로전환기 이상상황 데이터를 대상으로 빅 데이터 해석학적 입장에서 심층 분석이 가능한 새로운 철도 유지보수 분석 시스템의 프로토타입을 제안한다. 제안하는 시스템은 첫째, 유지관리시스템에 저장된 선로전환기 데이터와 이상상황 데이터를 정규화하고 추출하여 베이스 테이블을 생성한다. 둘째, 베이스 테이블 상의 속성들을 스타 스키마로 설계하여 철도 유지보수 큐브로 구축한다. 마지막으로, 매핑된 철도 유지보수 큐브와 오라클에서 제공하는 AWM을 활용해 다차원적이고 심층적인 OLAP(On-Line Analytical Processing) 분석이 가능하다.

The Method of Analyzing Firewall Log Data using MapReduce based on NoSQL (NoSQL기반의 MapReduce를 이용한 방화벽 로그 분석 기법)

  • Choi, Bomin;Kong, Jong-Hwan;Hong, Sung-Sam;Han, Myung-Mook
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.23 no.4
    • /
    • pp.667-677
    • /
    • 2013
  • As the firewall is a typical network security equipment, it is usually installed at most of internal/external networks and makes many packet data in/out. So analyzing the its logs stored in it can provide important and fundamental data on the network security research. However, along with development of communications technology, the speed of internet network is improved and then the amount of log data is becoming 'Massive Data' or 'BigData'. In this trend, there are limits to analyze log data using the traditional database model RDBMS. In this paper, through our Method of Analyzing Firewall log data using MapReduce based on NoSQL, we have discovered that the introducing NoSQL data base model can more effectively analyze the massive log data than the traditional one. We have demonstrated execellent performance of the NoSQL by comparing the performance of data processing with existing RDBMS. Also the proposed method is evaluated by experiments that detect the three attack patterns and shown that it is highly effective.