• Title/Summary/Keyword: 빅데이터플랫폼 (big data platform)


Development of Information Technology Infrastructures through Construction of Big Data Platform for Road Driving Environment Analysis (도로 주행환경 분석을 위한 빅데이터 플랫폼 구축 정보기술 인프라 개발)

  • Jung, In-taek;Chong, Kyu-soo
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.3 / pp.669-678 / 2018
  • This study developed information technology infrastructure for a driving environment analysis platform that uses various big data sources, such as vehicle sensing data and public data. First, on the hardware side, a small platform server with a parallel structure was developed for distributed big data processing. Next, on the software side, programs were developed for big data collection/storage, processing/analysis, and information visualization. The collection software was developed as a collection interface using Kafka, Flume, and Sqoop. The storage software divides data between the Hadoop Distributed File System and a Cassandra DB according to how the data are used. The processing software performs spatial unit matching and time interval interpolation/aggregation of the collected data by applying the grid index method. The analysis software was developed as an analytical tool based on the Zeppelin notebook for applying and evaluating developed algorithms. Finally, the information visualization software was developed as a Web GIS engine program that provides and visualizes various driving environment information. The performance evaluation derived the number of executors, the optimal memory capacity, and the number of cores for the developed server, and its computation performance was superior to that of the other cloud computing services compared.
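The grid index step mentioned in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the cell size `CELL_DEG`, the interval `BUCKET_SEC`, the record layout, and the function names are all assumptions chosen for illustration. Each observation is mapped to a grid cell and a time bucket, and values sharing a key are averaged.

```python
from collections import defaultdict

CELL_DEG = 0.01    # hypothetical grid cell size in degrees
BUCKET_SEC = 300   # hypothetical aggregation interval (5 minutes)

def grid_key(lat, lon, ts):
    """Map a point observation to a (row, column, time bucket) key."""
    return (int(lat // CELL_DEG), int(lon // CELL_DEG), int(ts // BUCKET_SEC))

def aggregate(records):
    """Average the observed value per grid cell and time bucket."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for lat, lon, ts, value in records:
        acc = sums[grid_key(lat, lon, ts)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical records: (latitude, longitude, seconds since start, speed)
records = [
    (37.5012, 127.0396, 0, 40.0),
    (37.5018, 127.0391, 120, 50.0),
    (37.5012, 127.0396, 310, 30.0),
]
result = aggregate(records)
```

The first two records fall in the same cell and the same 5-minute bucket, so they are averaged together; the third lands in the next bucket and is kept separate.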

A Study on the Application of Macro Model in the Housing Market with Integrated Information Platform (주택시장의 통합정보 플랫폼과 연계한 거시 모형 적용성 방안 연구)

  • Jung, Hoi-Min;Lee, Sang-Hun;Moon, Sung-Min
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.17-18 / 2019
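  • The open-platform-based housing market analysis platform is built on Hadoop, with Linux (CentOS) as the server operating system, for collecting, processing, analyzing, and forecasting big data in the housing sector. Built on an open-source platform, it is intended to analyze diverse large-scale data and to verify predictive power by applying micro and macro models. This study proposes a way to derive results by building an open-source analysis platform and linking it with the Windows-based E-Views macro analysis model that was previously used for this analysis.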

Measuring Hadoop Optimality by Lorenz Curve (로렌츠 커브를 이용한 하둡 플랫폼의 최적화 지수)

  • Kim, Woo-Cheol;Baek, Changryong
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.249-261 / 2014
  • Ever-increasing "big data" can only be processed effectively by parallel computing. Parallel computing refers to a high-performance computational method that achieves effectiveness by dividing a big query into smaller subtasks and aggregating the results of the subtasks to produce an output. However, it is well known that parallel computing does not automatically achieve scalability, meaning that performance improves linearly as more computers are added, because it requires very careful assignment of tasks to each node and timely collection of results. Hadoop is one of the most successful platforms for attaining scalability. In this paper, we propose a measure of Hadoop optimization that uses a Lorenz curve as a proxy for the inequality of hardware resource usage. Our proposed index takes into account the intrinsic overhead of Hadoop systems, such as CPU, disk I/O, and network. It therefore also indicates explicitly whether a given Hadoop system can be improved and in what capacity. The proposed method is illustrated with experimental data and substantiated by Monte Carlo simulations.
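As a rough illustration of how a Lorenz curve can summarize inequality across cluster nodes, the sketch below computes the curve and the standard Gini coefficient for a list of per-node loads. This is the textbook Gini calculation, not the index proposed in the paper; the load values and function names are hypothetical, and the code assumes the total load is positive.

```python
def lorenz_curve(loads):
    """Cumulative load shares for nodes sorted from least to most loaded."""
    ordered = sorted(loads)
    total = sum(ordered)  # assumed to be positive
    points, running = [0.0], 0.0
    for load in ordered:
        running += load
        points.append(running / total)
    return points

def gini(loads):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(loads)
    n = len(loads)
    # Trapezoid rule over n equal-width segments of the population axis.
    area = sum((pts[i] + pts[i + 1]) / (2 * n) for i in range(n))
    return 1.0 - 2.0 * area

balanced = gini([10, 10, 10, 10])  # every node carries the same load
skewed = gini([1, 1, 1, 37])       # one node carries almost everything
```

A value near 0 means the load is spread evenly over the nodes, while a value approaching 1 means a few nodes carry most of it.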

Learning algorithms for big data logistic regression on RHIPE platform (RHIPE 플랫폼에서 빅데이터 로지스틱 회귀를 위한 학습 알고리즘)

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.911-923 / 2016
  • Machine learning is becoming increasingly important in the big data era. Logistic regression is a classification method in machine learning and has been widely used in various fields, including medicine, economics, marketing, and the social sciences. Rhipe, which integrates R and the Hadoop environment, has not been discussed by many researchers owing to the difficulty of its installation and of MapReduce implementation. In this paper, we present MapReduce implementations of the Gradient Descent algorithm and the Newton-Raphson algorithm for logistic regression using Rhipe. The Newton-Raphson algorithm does not require a learning rate, whereas the Gradient Descent algorithm requires one to be chosen manually. We choose the learning rate with a mixed procedure of grid search and binary search so that big data can be processed efficiently. In the performance study, our Newton-Raphson algorithm outperforms the Gradient Descent algorithm on all the tested data.
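The contrast between the two algorithms can be seen in a small single-process sketch. It does not use Rhipe or MapReduce and is not the authors' code; the toy data, the learning rate, and the function names are assumptions. The per-record sums in `gradient_and_hessian` are the kind of quantity a map step could emit and a reduce step could add up. The gradient variant is written as ascent on the log-likelihood, which is equivalent to descent on its negative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_and_hessian(data, b0, b1):
    """Sum per-record contributions to the log-likelihood gradient and X^T W X."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        p = sigmoid(b0 + b1 * x)
        g0 += y - p
        g1 += (y - p) * x
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def gradient_ascent(data, rate, steps):
    """First-order update; needs a manually chosen learning rate."""
    b0 = b1 = 0.0
    for _ in range(steps):
        (g0, g1), _hess = gradient_and_hessian(data, b0, b1)
        b0 += rate * g0
        b1 += rate * g1
    return b0, b1

def newton_raphson(data, steps):
    """Second-order update; the 2x2 system is solved in closed form."""
    b0 = b1 = 0.0
    for _ in range(steps):
        (g0, g1), (h00, h01, h11) = gradient_and_hessian(data, b0, b1)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical (x, y) pairs; the classes overlap, so the estimate is finite.
data = [(0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0), (2.5, 1), (3.0, 1)]
newton_fit = newton_raphson(data, 25)
gradient_fit = gradient_ascent(data, 0.1, 5000)
```

On this toy data both routines reach the same estimate, but Newton-Raphson needs far fewer passes over the data and no learning rate, which matches the trade-off the abstract describes.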

Design of Splunk Platform based Big Data Analysis System for Objectionable Information Detection (Splunk 플랫폼을 활용한 유해 정보 탐지를 위한 빅데이터 분석 시스템 설계)

  • Lee, Hyeop-Geon;Kim, Young-Woon;Kim, Ki-Young;Choi, Jong-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.1 / pp.76-81 / 2018
  • The Internet of Things (IoT), which is emerging as a future economic growth engine, has been actively introduced in areas close to our daily lives. However, IoT security threats remain to be resolved. In particular, with the spread of smart homes and smart cities, an explosive number of closed-circuit televisions (CCTVs) have been installed. The Internet protocol (IP) addresses and even port numbers assigned to CCTVs are exposed to the public via web portal search engines or social media platforms such as Facebook and Twitter, and with this information the devices can be easily hacked even with simple tools. For this reason, a big data analytics system is needed that supports quick responses to data that may contain security risk factors and to illegal websites that may cause social problems, by assisting in the analysis of data collected from the search engines and social media platforms frequently used by Internet users, as well as data on illegal websites.

Building an Analytical Platform of Big Data for Quality Inspection in the Dairy Industry: A Machine Learning Approach (유제품 산업의 품질검사를 위한 빅데이터 플랫폼 개발: 머신러닝 접근법)

  • Hwang, Hyunseok;Lee, Sangil;Kim, Sunghyun;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.125-140 / 2018
  • As one of the processes in the manufacturing industry, quality inspection inspects the intermediate products or final products to separate the good-quality goods that meet the quality management standard and the defective goods that do not. The manual inspection of quality in a mass production system may result in low consistency and efficiency. Therefore, the quality inspection of mass-produced products involves automatic checking and classifying by the machines in many processes. Although there are many preceding studies on improving or optimizing the process using the data generated in the production process, there have been many constraints with regard to actual implementation due to the technical limitations of processing a large volume of data in real time. The recent research studies on big data have improved the data processing technology and enabled collecting, processing, and analyzing process data in real time. This paper aims to propose the process and details of applying big data for quality inspection and examine the applicability of the proposed method to the dairy industry. We review the previous studies and propose a big data analysis procedure that is applicable to the manufacturing sector. To assess the feasibility of the proposed method, we applied two methods to one of the quality inspection processes in the dairy industry: convolutional neural network and random forest. We collected, processed, and analyzed the images of caps and straws in real time, and then determined whether the products were defective or not. The result confirmed that there was a drastic increase in classification accuracy compared to the quality inspection performed in the past.

Simulation for the Decision-making Models of Supply Chain Inventory Management System (공급망 재고관리시스템의 의사결정모형을 위한 시뮬레이션)

  • Chen, Jinhui;Nam, Soo-tae;Jin, Chan-yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.159-160 / 2021
  • The simulation results show that, when supply chain operations in the beer industry are coordinated through a big data collaborative platform, direct logistics inventory in the supply chain stays at a relatively stable level. Unlike in traditional beer supply chain operation, there are no zero-inventory situations or serious beer shortages, which avoids the amplification of demand information caused by inventory-level reports along the chain when supply is seriously short.


Utilizing Spatial Big Data for Land and Housing Sector (토지주택분야 정보 현황과 빅데이터 연계활용 방안)

  • Jeong, Yeun-Woo;Yu, Jong-Hun
    • Land and Housing Review / v.7 no.1 / pp.19-29 / 2016
  • This study reviews big data policy and case studies in Korea and the application of spatial big data to land and housing, in order to identify future business opportunities and to propose spatial-big-data-based applications for government policy in advance. First, big data policies and cases in Korea were evaluated. Centered on the Government 3.0 Committee, each government department is building big-data-based information systems, and the Ministry of Land, Infrastructure, and Transport has been establishing a spatial big data system since 2013 to support big data applications and job creation through the national spatial information platform. Second, based on the information systems established and administered by LH, the status of national territory information and its application to land and housing were evaluated. The information systems fall mainly into five categories: public administration support, statistical views, real estate information, online petitions, and national policy support. The basic direction for major applications considered national territory information (DB), application demand (scope of work), and profit creation (business model). Evaluating approaches in terms of work scope and work procedure under this basic direction yielded four application fields: selecting candidate land for regional development projects, administering and operating rental housing, prioritizing land preservation, and prioritizing urban regeneration. Third, to implement a spatial big data application system in these four fields, the required data and the application and analysis procedures for each field were proposed, along with the improvements and future directions required of LH to implement a spatial big data application solution.

KISTI-ML Platform: A Community-based Rapid AI Model Development Tool for Scientific Data (KISTI-ML 플랫폼: 과학기술 데이터를 위한 커뮤니티 기반 AI 모델 개발 도구)

  • Lee, Jeongcheol;Ahn, Sunil
    • Journal of Internet Computing and Services / v.20 no.6 / pp.73-84 / 2019
  • Machine learning as a service (MLaaS) has recently attracted much attention in almost all industries and research groups. The main reason is that, apart from the data itself, you do not need network servers, storage, or even data scientists to build a productive service model. However, machine learning is often very difficult for most developers, especially in traditional science, because well-structured big data are lacking for scientific data. Experimental and applied researchers rarely share the results of their experiments with other researchers, so creating big data in specific research areas is also a big challenge. In this paper, we introduce the KISTI-ML platform, a community-based rapid AI model development tool for scientific data. It provides a user-friendly online development environment in which machine learning beginners can use their own data to generate code automatically. Users can share datasets and their Jupyter interactive notebooks among authorized community members, including know-how such as data preprocessing to extract features, hidden network design, and other engineering techniques.

Improvement of Information Service System for Smart Library Based on Bigdata Plateform (빅데이터 플랫폼 기반 스마트도서관 정보서비스시스템의 구현)

  • Min, Byoung-Won;Oh, Yong-Sun
    • Proceedings of the Korea Contents Association Conference / 2013.05a / pp.263-264 / 2013
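  • Whereas conventional library information services emphasized only 1:n online knowledge services delivered by library staff, a smart library system can use big data to create, verify, and classify knowledge and thereby provide intelligent, immersive, personalized, and experiential knowledge. Big data also enables multi-party content sharing and mutual exchange of opinions. Learning content and knowledge bases built through collective intelligence can strengthen the nation's competitiveness in knowledge resources, and intelligent tutoring in next-generation e-learning environments can help realize national policy goals such as fostering creative talent, improving the quality of public education, reducing private education costs, equalizing educational opportunity, and easing disparities between regions and social classes. For the proposed big-data-based smart library information service system, we developed core components that can be implemented in a multi-tenant environment. The system was therefore implemented as a SaaS-based, software-on-demand service model that requires almost no initial investment and offers easy, simple, low-cost IT services. In addition, the connection model is N customers to 1 instance, all customers run the same program code, customers can customize the system directly through per-tenant configuration settings, and data can be shared across tenants. Performance was improved to resolve the shortcomings of existing digital library system services.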
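The multi-tenant arrangement in this abstract (N customers on one instance, identical code, per-tenant configuration) can be sketched as follows. This is an illustrative assumption rather than the authors' design: the class name `LibraryService`, the setting names in `DEFAULTS`, and the tenant identifiers are all hypothetical.

```python
DEFAULTS = {"language": "ko", "loan_days": 14, "theme": "light"}  # hypothetical settings

class LibraryService:
    """One shared instance that serves every tenant with the same code."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._overrides = {}  # tenant id -> settings changed by that tenant

    def configure(self, tenant, **settings):
        """Let a tenant customize its own environment without touching the code."""
        unknown = set(settings) - set(self._defaults)
        if unknown:
            raise KeyError(f"unknown settings: {sorted(unknown)}")
        self._overrides.setdefault(tenant, {}).update(settings)

    def settings_for(self, tenant):
        """Shared defaults overlaid with the tenant's own overrides."""
        return {**self._defaults, **self._overrides.get(tenant, {})}

service = LibraryService(DEFAULTS)
service.configure("city-library", loan_days=21)
service.configure("school-library", theme="dark", language="en")
```

Keeping overrides separate from the defaults means a new tenant can be served immediately with no setup, which is one way to read the low initial cost the abstract attributes to the SaaS model.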
