• Title/Summary/Keyword: 빅데이터 분석 플랫폼 (big data analysis platform)


Performance evaluation and prediction for number of slave nodes in Spark (스파크 기반 분산 환경에서 슬레이브 노드의 개수에 따른 성능 분석과 예측)

  • Bak, Bongwoo;Myung, Rohyoung;Chung, KwangSik;Yu, Heonchang;Choi, Sukyong
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.94-96 / 2017
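  • As systems built on big data have come into active use across many fields, Apache Spark has emerged as a distributed system platform that can compensate for the technical shortcomings of Hadoop, the representative big data storage and processing platform. On this platform, large-scale computation is performed by distributing work across slave nodes. However, it is difficult to predict how many slave nodes are needed to reach a required level of performance, and how much computing capacity each node needs. This paper builds a model from experiments to predict which conditions must be met to achieve a desired level of performance in Spark and what level of performance the current environment can deliver.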
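
The kind of prediction this abstract describes can be sketched as follows. This is a minimal illustration, not the paper's model: the functional form `runtime = serial + parallel / n` is an assumption (the abstract does not state one), the function names are hypothetical, and the measurements are synthetic placeholders rather than results from the paper.

```python
import math


def fit_runtime_model(nodes, runtimes):
    """Least-squares fit of runtime = serial + parallel / n (assumed form)."""
    z = [1.0 / n for n in nodes]  # regress runtime on 1/n
    k = len(z)
    mean_z = sum(z) / k
    mean_t = sum(runtimes) / k
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szt = sum((zi - mean_z) * (ti - mean_t) for zi, ti in zip(z, runtimes))
    parallel = szt / szz
    serial = mean_t - parallel * mean_z
    return serial, parallel


def predict_runtime(model, n):
    """Predicted runtime on n slave nodes under the fitted model."""
    serial, parallel = model
    return serial + parallel / n


def nodes_needed(model, target_runtime):
    """Smallest node count predicted to meet the target, or None if the
    target is at or below the fitted serial (non-parallelizable) time."""
    serial, parallel = model
    if target_runtime <= serial:
        return None
    return max(1, math.ceil(parallel / (target_runtime - serial)))


# Synthetic placeholder measurements (seconds), generated from 20 + 120 / n.
nodes = [1, 2, 4, 8]
runtimes = [140.0, 80.0, 50.0, 35.0]

model = fit_runtime_model(nodes, runtimes)
print(predict_runtime(model, 16))   # predicted runtime on 16 nodes
print(nodes_needed(model, 45.0))    # nodes needed to finish within 45 s
```

A model of this shape answers both questions raised in the abstract: what the current cluster can deliver (`predict_runtime`) and how many nodes a target requires (`nodes_needed`). Real workloads would likely need extra terms, for example for shuffle and scheduling overhead.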

A Study on 5 Platform Technology Trends for 4th Industrial Revolution (4차 산업혁명 관련 5대 플랫폼 기술의 연구 수준 분석)

  • Chun, Ki Woo;Kim, Haedo;Park, Kwisun;Lee, Keonsoo
    • Proceedings of the Korea Technology Innovation Society Conference / 2017.11a / pp.1305-1319 / 2017
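  • Following the information and communications technology (ICT) revolution, technologies that can automate work previously performed by people, without human assistance, are emerging as core technologies of the Fourth Industrial Revolution. It is well known that the Fourth Industrial Revolution is accelerating change in the current industrial landscape and in economic and social paradigms. Accordingly, we selected five core platform technologies that provide its main driving force (artificial intelligence, big data, the Internet of Things, cloud computing, and 3D printing) and assessed global research trends and Korea's research level. For the five technologies, we analyzed the publication of academic research papers over the most recent five years using Elsevier's SCOPUS database, and surveyed the projects funded by the National Research Foundation of Korea. On this basis, we identified the international research level and the leading research institutions for each technology, and sought to derive policy implications for research and development in response to the Fourth Industrial Revolution. The United States and China lead in all five technologies, while European countries such as the United Kingdom, Germany, France, and Italy, together with Japan, India, and Korea, form a chasing group. Korea's research competitiveness was relatively weak in software compared with hardware, and the analysis confirmed that an overall improvement in research quality is needed.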

Design of Spark SQL Based Framework for Advanced Analytics (Spark SQL 기반 고도 분석 지원 프레임워크 설계)

  • Chung, Jaehwa
    • KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.477-482 / 2016
  • As advanced analytics on big data becomes indispensable for agile decision-making and tactical planning in enterprises, distributed processing platforms such as Hadoop and Spark, which distribute and handle large volumes of data across multiple nodes, are receiving great attention in the field. In the Spark platform stack, Spark SQL was recently introduced so that Spark can support an SQL-based distributed processing framework. However, Spark SQL cannot effectively handle advanced analytics that involve machine learning and graph processing, in terms of iterative tasks and task allocation. Motivated by these issues, this paper proposes the design of an SQL-based optimal big data processing engine and a processing framework to support advanced analytics in Spark environments. The processing engine handles complex SQL queries that involve multiple parameters and join, aggregation, and sorting operations in a distributed, parallel manner, and the proposed framework optimizes the machine learning process in terms of relational operations.

Evolution of ICT Ecosystem and Mobile Telcos' Counterstrategies (ICT 생태계 변화에 따른 국내 이동통신 사업자의 대응 전략에 대한 연구)

  • Kim, Dong Ju;Kang, Mincheol
    • Journal of Information Technology and Architecture / v.10 no.2 / pp.197-209 / 2013
  • This study analyzes the nature of consumers and smartphones, as well as the limitations that domestic mobile communication companies confront. According to the analysis, emerging technologies such as 5G communication, pervasive computing, augmented reality, and big data are likely to have a significant effect on the ICT ecosystem in the near future. Based on the results, this study suggests four counterstrategies for domestic mobile communication companies: a big data strategy, preparation for things acting as main communication agents, development of new service platforms, and a 'total life care service provider' strategy.

One-stop Platform for Verification of ICT-based environmental monitoring sensor data (ICT 기반 환경모니터링 센서 데이터 검증을 위한 원스탑 플랫폼)

  • Chae, Minah;Cho, Jae Hyuk
    • Journal of Platform Technology / v.9 no.1 / pp.32-39 / 2021
  • Existing environmental measuring devices mainly focus on electromagnetic-wave and eco-friendly product certification and durability testing, while sensor reliability and measurement data are verified mainly through sensor performance evaluation: type approval and registration, acceptance testing, initial calibration, and periodic testing. This platform establishes an ICT-based reliability verification system for environmental monitoring sensors that supports not only performance evaluation for each target sensor but also verification of sensor data reliability. A sensor board was produced to collect sensor data on environmental information, and a service system for evaluating and verifying sensor and data reliability was standardized. In addition, to evaluate and verify the reliability of sensor data based on ICT, a prototype sensor data monitoring platform using LoRa communication was produced and tested in smart cities. To analyze the data received through the system, an optimization algorithm was developed using machine learning. Through this work, a sensor big data analysis system is established for reliability verification, and the foundation for an integrated evaluation and verification system is provided.

Big data distributed processing system using RHadoop (RHadoop을 이용한 빅데이터 분산처리 시스템)

  • Shin, Ji Eun;Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.26 no.5 / pp.1155-1166 / 2015
  • It is almost impossible to store or analyze exponentially growing big data with traditional technologies; Hadoop is a newer technology that makes this possible. Recently, R has been used as an engine for big data analysis based on distributed processing with Hadoop. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual and simulated data of various sizes. Experimental results showed that our RHadoop system became faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package, available on bigmemory. The results showed that RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the data size increases.
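
One common way to parallelize multiple regression in a map/reduce style is sketched below. The abstract does not say which algorithm the authors used, so this is an assumption: each map task computes the partial sums X^T X and X^T y for its chunk, the reduce step adds them, and the normal equations are solved once at the end. It is written in plain Python rather than R or Hadoop so that it runs anywhere; all function names are hypothetical and the data are synthetic.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: partial X^T X and X^T y for one chunk of (features, target)."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * target
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(a, b):
    """Reduce step: element-wise sum of two partial results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def fit_chunked(chunks):
    """Combine per-chunk partial sums, then solve the normal equations."""
    xtx, xty = reduce(reduce_partials, map(map_chunk, chunks))
    return solve(xtx, xty)


# Synthetic, noise-free data: y = 1 + 2*x1 + 3*x2, split into 4 chunks.
data = [((x1, x2), 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(4)]
chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
beta = fit_chunked(chunks)
print(beta)  # approximately [1.0, 2.0, 3.0]
```

Because the partial sums have a fixed size that depends only on the number of predictors, the reduce step stays cheap however many rows each map task processes, which is consistent with the abstract's observation that more map tasks help as data grows.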

Design and Evaluation Security Control Iconology for Big Data Processing (빅데이터 처리를 위한 보안관제 시각화 구현과 평가)

  • Jeon, Sang June;Yun, Seong Yul;Kim, Jeong Ho
    • Journal of Platform Technology / v.8 no.4 / pp.38-46 / 2020
  • This study describes how to build a security control system using an open source big data solution so that private companies can build an overall security control infrastructure. In particular, the infrastructure was built using the Elastic Stack, a free open source big data analysis solution, as a way to reduce cost and development time when building a security control system. A comparative experiment was conducted. In addition, a comparison of the functions, convenience, service, and technical support of the two solutions found that the Elastic Stack has advantages for security control of big data in terms of community and openness. Using the Elastic Stack, security logs were collected, analyzed, and visualized step by step to create a dashboard, large logs were ingested, and search speed was measured. The results suggest that the Elastic Stack could replace Splunk as a big data analysis solution.

Customized marketing optimization for Big Data in SNS Environment (SNS 환경에서 빅데이터 활용을 위한 고객맞춤 마케팅 최적화)

  • Song, Jung-Ho;Park, Seok-Cheon
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.1120-1123 / 2013
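  • With the recent flood of data and the arrival of the big data era, a growing number of companies want to use SNS as a new platform for marketing. Companies can analyze SNS data and use it in marketing through open APIs. However, SNS providers restrict the use of open APIs because of excessive traffic and security concerns. Customer-tailored optimization is therefore needed so that open APIs can be used effectively within the limited number of calls. Such optimization is possible with existing multicasting, but because multicasting does not reflect the characteristics of SNS, its usefulness in SNS marketing is inevitably limited. This paper presents a new customer-tailored optimization for SNS marketing that addresses the limits of multicasting-based optimization and makes better use of the characteristics of SNS.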

Artificial Intelligence-based Security Control Construction and Countermeasures (인공지능기반 보안관제 구축 및 대응 방안)

  • Hong, Jun-Hyeok;Lee, Byoung Yup
    • The Journal of the Korea Contents Association / v.21 no.1 / pp.531-540 / 2021
  • As cyber attacks and crimes increase exponentially and hacking becomes more intelligent and advanced, attack methods and routes are evolving unpredictably and in real time. To strengthen the ability to respond to such attacks, this study proposes a method for developing an artificial intelligence-based security control platform by building a next-generation security system that uses artificial intelligence to respond through self-learning, monitoring abnormal signs and blocking attacks. The platform should be developed as the basis for data collection, data analysis, next-generation security system operation, and security system management. It comprises a big data base and control system; a data collection step that draws on external threat information; a data analysis step that pre-processes and formalizes the collected data to perform positive/false detection and abnormal behavior analysis with deep learning-based algorithms; operation of a next-generation security system that, through an organic cycle of prevention, control, response, and analysis applied to the analyzed data, increases the scope and speed of handling new threats and strengthens the identification of normal and abnormal behavior; and management of the security threat response system, including harmful IP management, detection policy management, and management of the legal framework for security operations. Through this, we seek a way to comprehensively analyze vast amounts of data and to respond preemptively in a short time.

Implementation of a pet product recommendation system using big data (빅 데이터를 활용한 애완동물 상품 추천 시스템 구현)

  • Kim, Sam-Taek
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.19-24 / 2020
  • With the recent rapid increase in the number of pets, there is a need for an integrated, personalized pet product recommendation service, such as feed recommendation based on pet health checks and various collected data. This paper implements a product recommendation system that uses big data to provide personalized services covering the collection, pre-processing, analysis, and management of pet-related data. First, information from sensors worn by pets, customer purchase patterns, and SNS information are collected and stored in a database. Then, using statistical analysis, a platform is implemented that supports personalized recommendation services such as feed production and pet health management. The platform provides information to customers by outputting similar-product information for the product being analyzed and, finally, the result of the recommendation analysis.