• Title/Summary/Keyword: Big-Data Platform

Development of Big-data Management Platform Considering Docker Based Real Time Data Connecting and Processing Environments (도커 기반의 실시간 데이터 연계 및 처리 환경을 고려한 빅데이터 관리 플랫폼 개발)

  • Kim, Dong Gil;Park, Yong-Soon;Chung, Tae-Yun
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.4 / pp.153-161 / 2021
  • Handling continuous, unstructured data requires real-time access and management that stays flexible under dynamic conditions. A platform can be built to collect, store, and process data on a local server or across multiple servers. The former, centralized approach is easy to control but creates an overload problem because all processing runs in one unit; the latter, distributed approach processes in parallel, so it responds quickly and scales capacity easily, but its design is complex. This paper takes the distributed approach and provides data collection and processing on one platform to derive significant insights from the various data held by an enterprise or agency; results are presented intuitively on dashboards, and Spark is used to improve distributed processing performance. All services are deployed and managed with Docker. All data used in this study was collected through Kafka. For a 4.4-gigabyte file, processing took 2 minutes 15 seconds in Spark cluster mode, about 3 minutes 19 seconds faster than in local mode.
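The timing figures in this abstract imply a local-mode run time that is not stated directly. As a rough worked check, assuming "faster" refers to the difference in wall-clock time, the numbers combine as follows. The helper `to_seconds` and the variable names are illustrative and do not come from the paper.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds


# Figures quoted in the abstract for the 4.4-gigabyte file.
cluster_s = to_seconds(2, 15)   # Spark cluster mode run time
saved_s = to_seconds(3, 19)     # reported time saved versus local mode

# Assumption: local-mode time = cluster-mode time + time saved.
local_s = cluster_s + saved_s
speedup = local_s / cluster_s

print(cluster_s)           # 135
print(local_s)             # 334
print(round(speedup, 2))   # 2.47
```

Under that reading, local mode took roughly 5 minutes 34 seconds, so cluster mode was about 2.5 times as fast on this file.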

A Data-driven Approach for Computational Simulation: Trend, Requirement and Technology

  • Lee, Sunghee;Ahn, Sunil;Joo, Wonkyun;Yang, Myungseok;Yu, Eunji
    • Journal of Internet Computing and Services / v.19 no.1 / pp.123-130 / 2018
  • With the emergence of the new paradigms of Open Science and Big Data, the need for data sharing and collaboration is also growing in the computational science field. In this paper, we analyze data-driven research cases in computational science by field: material design, bioinformatics, and high energy physics. We also study the characteristics of computational science data and the associated data management issues. Managing computational science data effectively requires data quality management, increased data reliability, flexibility to support a variety of data types, and tools for analysis and linkage to the computing infrastructure. In addition, we analyze trends in platform technology for efficient sharing and management of computational science data. The main contribution of this paper is to review the various computational science data repositories and related platform technologies, to analyze the characteristics of computational science data and the problems of data management, and to present design considerations for building a future computational science data platform.

A Study on Research Trends in Metaverse Platform Using Big Data Analysis (빅데이터 분석을 활용한 메타버스 플랫폼 연구 동향 분석)

  • Hong, Jin-Wook;Han, Jung-Wan
    • Journal of Digital Convergence / v.20 no.5 / pp.627-635 / 2022
  • As the non-face-to-face situation caused by COVID-19 continues, the underlying technologies of the 4th industrial revolution, such as IoT, AR, VR, and big data, are affecting the metaverse platform overall. Such changes in the external environment, including society and culture, can affect academic development, so it is very important to systematically organize existing achievements in preparation for change. Data with 'metaverse platform' in the keywords were collected from the Korea Educational Research Information Service (RISS) and analyzed with text mining, one of the big data analysis techniques. The collected data were analyzed for word cloud frequency, connection strength between keywords, and semantic network structure to examine trends in metaverse platform research. In the word cloud analysis, keywords appeared in the order of 'use', 'digital', 'technology', and 'education'. In the analysis of connection strength (N-gram) between keywords, 'Edue→Tech' showed the highest connection strength, and a total of three word-chain clusters were derived. Detailed research areas were classified into five areas, including 'digital technology'. Considering the analysis results comprehensively, it seems necessary to discover and discuss more active research topics from the long-term perspective of developing a metaverse platform.
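One common way to measure N-gram "connection strength" between keywords is to count how often one keyword directly follows another. The sketch below illustrates that idea with bigrams on invented toy data. The abstract does not describe the tool, tokenization, or corpus the authors used, so the function name `bigram_counts` and the sample keyword lists are assumptions for illustration only.

```python
from collections import Counter


def bigram_counts(docs):
    """Count ordered adjacent keyword pairs (bigrams) across documents."""
    counts = Counter()
    for tokens in docs:
        # zip pairs each token with its immediate successor.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Toy keyword sequences, invented for this example (not the paper's data).
docs = [
    ["metaverse", "education", "technology"],
    ["education", "technology", "use"],
    ["digital", "education", "technology"],
]

counts = bigram_counts(docs)
strongest_pair, strength = counts.most_common(1)[0]
print(strongest_pair, strength)  # ('education', 'technology') 3
```

The pair with the highest count plays the role of the strongest connection; chaining such pairs is one way word-chain clusters could be assembled.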

Development of Information Technology Infrastructures through Construction of Big Data Platform for Road Driving Environment Analysis (도로 주행환경 분석을 위한 빅데이터 플랫폼 구축 정보기술 인프라 개발)

  • Jung, In-taek;Chong, Kyu-soo
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.3 / pp.669-678 / 2018
  • This study developed information technology infrastructures for building a driving environment analysis platform using various big data, such as vehicle sensing data and public data. First, on the H/W side, a small platform server with a parallel structure for distributed big data processing was developed. Next, on the S/W side, programs for big data collection/storage, processing/analysis, and information visualization were developed. The collection S/W was developed as a collection interface using Kafka, Flume, and Sqoop. The storage S/W was split between a Hadoop distributed file system and Cassandra DB according to how the data are used. The processing S/W was developed for spatial unit matching and time interval interpolation/aggregation of the collected data by applying the grid index method. The analysis S/W was developed as an analytical tool based on the Zeppelin notebook for applying and evaluating the developed algorithm. Finally, the information visualization S/W was developed as a Web GIS engine program for providing and visualizing various driving environment information. The performance evaluation yielded the number of executors, the optimal memory capacity, and the number of cores for the development server, and the computation performance was superior to that of other cloud computing.
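The grid index idea mentioned in this abstract can be sketched as follows: each record is assigned to a fixed-size spatial cell and a fixed-length time bucket, and values that share both are aggregated. This is a minimal sketch under assumed parameters. The cell size, interval length, record layout, and function names are hypothetical, and the interpolation step the abstract also mentions is not shown.

```python
import math
from collections import defaultdict


def grid_cell(lon, lat, cell_deg=0.01):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def time_bucket(ts, interval_s=300):
    """Round a timestamp (in seconds) down to the start of its interval."""
    return ts - ts % interval_s


def aggregate(records, cell_deg=0.01, interval_s=300):
    """Average values per (grid cell, time bucket) key."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for lon, lat, ts, value in records:
        key = (grid_cell(lon, lat, cell_deg), time_bucket(ts, interval_s))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Invented sample records: (longitude, latitude, timestamp_s, speed).
records = [
    (127.0012, 37.5031, 1000, 40.0),
    (127.0078, 37.5069, 1100, 60.0),  # same cell and bucket as the first
    (127.0151, 37.5031, 1000, 30.0),  # neighbouring cell
]

result = aggregate(records)
print(result[((12700, 3750), 900)])  # 50.0
```

Because the cell index is computed directly from the coordinates, matching a record to its spatial unit needs no search over geometry, which is the main appeal of a grid index for high-volume sensing data.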

A GPU-enabled Face Detection System in the Hadoop Platform Considering Big Data for Images (이미지 빅데이터를 고려한 하둡 플랫폼 환경에서 GPU 기반의 얼굴 검출 시스템)

  • Bae, Yuseok;Park, Jongyoul
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.20-25 / 2016
  • With the advent of the era of digital big data, the Hadoop platform has become widely used in various fields. However, when processing a large number of small files, the Hadoop MapReduce framework suffers from increased main memory use on the name node and an increased number of map tasks. In addition, a method for running C++-based tasks in the MapReduce framework is required in order to use GPUs, which support hardware-based data parallelism, within that framework. Therefore, in this paper, we present a face detection system that generates a sequence file from images in order to process image big data on the Hadoop platform. The system also handles GPU-based face detection tasks in the MapReduce framework using Hadoop Pipes. We demonstrate a performance increase of around 6.8-fold compared to a single CPU process.

Development of a Platform Using Big Data-Based Artificial Intelligence to Predict New Demand of Shipbuilding (선박 신수요 예측을 위한 빅데이터 기반 인공지능 알고리즘을 활용한 플랫폼 개발)

  • Lee, Sangwon;Jung, Inhwan
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.1 / pp.171-178 / 2019
  • Korea's shipbuilding industry is in a critical condition due to changes in the domestic and international environment. To overcome this crisis, preemptive development of products and technologies through prediction of new demand for ships is necessary. The goal of this research is to develop an artificial intelligence algorithm based on ship big data in order to predict new demand for ships. We intend to develop a big data analytics platform specialized in predicting ship demand and to use the resulting forecasts of new ship demand for planning and developing new products. In this way, the development of sustainable new business models for equipment manufacturers will create new growth engines for shipyards and shipbuilders. Furthermore, shipbuilders are expected to be able to create business cases based on measurable performance, plan market-oriented products and services, and continuously achieve innovation with highly market-disruptive power.

An Assessment System for Evaluating Big Data Capability Based on a Reference Model (빅데이터 역량 평가를 위한 참조모델 및 수준진단시스템 개발)

  • Cheon, Min-Kyeong;Baek, Dong-Hyun
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.2 / pp.54-63 / 2016
  • As technology has developed and the cost of data processing has fallen, the big data market has grown. Developed countries such as the United States have invested steadily in the big data industry and achieved some remarkable results, such as improving advertising effectiveness and obtaining patents for customer service. Every company aims for long-term survival and profit maximization, but to accomplish these goals in the big data industry it needs to establish a good strategy that considers current industrial conditions. However, since the domestic big data industry is at an initial stage, local companies lack a systematic method for establishing a competitive strategy. Therefore, this research aims to help local companies diagnose their big data capabilities through a reference model and a big data capability assessment system. The big data reference model consists of five maturity levels (Ad hoc, Repeatable, Defined, Managed, and Optimizing) and five key dimensions (Organization, Resources, Infrastructure, People, and Analytics). The big data assessment system is designed around the reference model's key factors. In the Organization area, there are 4 key diagnosis factors: big data leadership, big data strategy, analytical culture, and data governance. In the Resources area, there are 3 factors: data management, data integrity, and data security/privacy. In the Infrastructure area, there are 2 factors: big data platform and data management technology. In the People area, there are 3 factors: training, big data skills, and business-IT alignment. In the Analytics area, there are 2 factors: data analysis and data visualization. The reference model and assessment system should be a useful guideline for local companies.
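The reference model described in this abstract maps naturally onto a small data structure. The sketch below encodes the five maturity levels and the diagnosis factors per dimension exactly as the abstract lists them. The names `MATURITY_LEVELS`, `REFERENCE_MODEL`, `factor_count`, and `level_name` are illustrative, and the 1-to-5 score mapping is a hypothetical convention, since the abstract does not say how the assessment system scores each factor.

```python
# Maturity levels in ascending order, as listed in the abstract.
MATURITY_LEVELS = ["Ad hoc", "Repeatable", "Defined", "Managed", "Optimizing"]

# Key dimensions and their diagnosis factors, as listed in the abstract.
REFERENCE_MODEL = {
    "Organization": [
        "big data leadership",
        "big data strategy",
        "analytical culture",
        "data governance",
    ],
    "Resources": ["data management", "data integrity", "data security/privacy"],
    "Infrastructure": ["big data platform", "data management technology"],
    "People": ["training", "big data skills", "business-IT alignment"],
    "Analytics": ["data analysis", "data visualization"],
}


def factor_count(model):
    """Total number of diagnosis factors across all dimensions."""
    return sum(len(factors) for factors in model.values())


def level_name(score):
    """Hypothetical mapping from an integer score (1 to 5) to a maturity level."""
    if not 1 <= score <= len(MATURITY_LEVELS):
        raise ValueError("score must be between 1 and 5")
    return MATURITY_LEVELS[score - 1]


print(factor_count(REFERENCE_MODEL))  # 14
print(level_name(3))                  # Defined
```

Summing the per-dimension counts given in the abstract (4 + 3 + 2 + 3 + 2) gives 14 diagnosis factors in total.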

Design and Implementation of AI Recommendation Platform for Commercial Services

  • Jong-Eon Lee
    • International journal of advanced smart convergence / v.12 no.4 / pp.202-207 / 2023
  • In this paper, we discuss the design and implementation of a recommendation platform actually built in the field. We survey deep learning-based recommendation models that are effective in reflecting individual user characteristics. The recently proposed RNN-based sequential recommendation models reflect individual user characteristics well. The recommendation platform we proposed has an architecture that can collect, store, and process big data from a company's commercial services. Our recommendation platform provides service providers with intuitive tools to evaluate and apply timely optimized recommendation models. In the model evaluation we performed, RNN-based sequential recommendation models showed high scores.

A Study on b-Traffic Service Platform based on Open data Infrastructure (공공데이터 인프라기반 b-Traffic 서비스 플랫폼 연구)

  • Son, Seok-Hyun;Song, Seok-Hyun;Shin, Hyo-Seop
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.117-118 / 2014
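  • Public institutions have recently become more active in releasing open data, and demand for application services built on that data is growing. Current traffic information prediction platforms analyze real-time traffic information or historical traffic records to provide future traffic volume or arrival time information, but they exclude factors that can immediately affect future traffic conditions, such as weather and accidents, which makes high reliability difficult to achieve. This paper presents the b-Traffic service platform, which includes a storage scheme for efficiently collecting, storing, and processing open data that influence traffic prediction, such as weather, accident, and traffic information, along with prediction technology capable of producing highly reliable traffic information.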


Design and Evaluation Security Control Iconology for Big Data Processing (빅데이터 처리를 위한 보안관제 시각화 구현과 평가)

  • Jeon, Sang June;Yun, Seong Yul;Kim, Jeong Ho
    • Journal of Platform Technology / v.8 no.4 / pp.38-46 / 2020
  • This study describes how to build a security control system using an open source big data solution so that private companies can build an overall security control infrastructure. In particular, the infrastructure was built using the Elastic Stack, one of the free open source big data analysis solutions, as a way to reduce the cost and development time of building a security control system. A comparative experiment was conducted. In addition, comparing and analyzing the functions, convenience, service, and technical support of the two solutions showed that the Elastic Stack has advantages for security control of big data in terms of community and openness. Using the Elastic Stack, security logs were collected, analyzed, and visualized step by step to create a dashboard, large logs were ingested, and search speed was measured. Through this, we found that the Elastic Stack has potential as a big data analysis solution that could replace Splunk.
