Title/Summary/Keyword: Distributed Data pipeline

Implementation of AIoT Edge Cluster System via Distributed Deep Learning Pipeline

  • Jeon, Sung-Ho;Lee, Cheol-Gyu;Lee, Jae-Deok;Kim, Bo-Seok;Kim, Joo-Man
    • International journal of advanced smart convergence / v.10 no.4 / pp.278-288 / 2021
  • Recently, IoT systems have become cloud-based, so the continuous, large volumes of data collected from sensor nodes are processed at the data server through the cloud. In such a centralized large-scale cloud configuration, however, computation is performed far from the physical location where data are collected and used, and the need for edge computers that reduce the network load on the cloud system is steadily growing. In this paper, a cluster system consisting of six inexpensive Raspberry Pi boards was constructed to perform fast data processing, and we propose a "Kubernetes cluster system (KCS)" that handles large-scale data collection and analysis through model distribution and a data-pipeline method. For comparison, a deep-learning ensemble model was built, and accuracy, processing performance, and processing time were compared and analyzed against the proposed KCS with model distribution. As a result, the ensemble model excelled in accuracy, but the KCS implemented as a data pipeline proved superior in processing speed.
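
The abstract gives no implementation details, so the following is only a minimal sketch of the data-pipeline idea it names: a model split into stages that run concurrently on separate workers, with each process standing in for one Raspberry Pi node. The stage bodies are hypothetical placeholders, not the paper's model.

```python
# Sketch of pipeline-style model distribution: each "node" (a process
# standing in for one Raspberry Pi) runs one model stage and passes
# its output downstream through a queue.
from multiprocessing import Process, Queue

def stage_a(inbox: Queue, outbox: Queue):
    """First model stage: placeholder feature extraction."""
    while (item := inbox.get()) is not None:
        outbox.put([x * 2.0 for x in item])   # placeholder compute
    outbox.put(None)                           # propagate shutdown

def stage_b(inbox: Queue):
    """Second model stage: placeholder classification."""
    while (item := inbox.get()) is not None:
        print("prediction:", sum(item) > 1.0)

if __name__ == "__main__":
    q1, q2 = Queue(), Queue()
    workers = [Process(target=stage_a, args=(q1, q2)),
               Process(target=stage_b, args=(q2,))]
    for w in workers:
        w.start()
    for sample in ([0.1, 0.2], [0.9, 0.8]):   # simulated sensor batches
        q1.put(sample)
    q1.put(None)                               # end of stream
    for w in workers:
        w.join()
```

Because the stages run concurrently, a new batch can enter stage A while the previous batch is still in stage B, which is the throughput advantage the paper reports for the pipeline over a monolithic model.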

SYSTEM ANALYSIS OF PIPELINE SOFTWARE - A CASE STUDY OF THE IMAGING SURVEY AT ESO

  • Kim, Young-Soo
    • Journal of Astronomy and Space Sciences / v.20 no.4 / pp.403-416 / 2003
  • There are common features between astronomical observation and remote sensing, in both imaging surveys and image processing. Handling large amounts of data in an easy and fast way has become a common issue, and implementing pipeline software, which allows various kinds of data to be processed automatically, can be a solution. As a case study, the development of pipeline software for the EIS (European Southern Observatory Imaging Survey) is introduced. The EIS team has been conducting a sky survey to provide candidate targets for VLT (Very Large Telescope) observations. The survey data are processed in a sequence of five major correction and reduction steps, i.e. preprocessing, flat fielding, photometric and astrometric correction, source extraction, and coaddition, and the processed data are eventually distributed to users. Pipeline software was developed to process this vast volume of observed data automatically. Because of the complexity of the objects and the different characteristics of each process, it was necessary to analyze the entire workflow of the EIS survey program. The overall tasks of the EIS are identified, the scheme of the EIS pipeline software is defined, the system structure and processes are presented, and in-depth flow charts are analyzed. The analyses revealed that handling the data flow and managing the database are central to the data processing. These analyses may also be applied to many other fields that require image processing.
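
As an illustration of the five-step reduction sequence the abstract lists, here is a minimal sketch with each step as a chained placeholder function. The step bodies are illustrative stand-ins, not the EIS algorithms.

```python
# Sketch of the five-step reduction pipeline: preprocessing, flat
# fielding, calibration, source extraction, and coaddition, chained
# in order over a set of simulated exposures.
import numpy as np

def preprocess(frame):        return frame - np.median(frame)   # bias-like removal
def flat_field(frame, flat):  return frame / flat               # normalize pixel response
def calibrate(frame):         return frame * 1.0                # photometric/astrometric stub
def extract_sources(frame):   return np.argwhere(frame > frame.mean() + 3 * frame.std())
def coadd(frames):            return np.mean(frames, axis=0)    # stack exposures

flat = np.full((64, 64), 1.05)                      # simulated flat field
exposures = [np.random.rand(64, 64) for _ in range(3)]
reduced = [calibrate(flat_field(preprocess(f), flat)) for f in exposures]
stacked = coadd(reduced)
print("sources found:", len(extract_sources(stacked)))
```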

High Speed 2D Discrete Cosine Transform Processor

  • Kim, Ji-Eun;Hae Kyung SEONG;Kang Hyeon RHEE
    • Proceedings of the IEEK Conference / 2002.07c / pp.1823-1826 / 2002
  • In modern computing environments, multimedia systems demand high-quality data, so data-compression technology for data transmission is now essential. This paper presents a pipeline architecture for the row and column address generator of a 2D DCT/IDCT (Discrete Cosine Transform/Inverse Discrete Cosine Transform). In the proposed architecture, the hardware area is reduced by using the DA (distributed arithmetic) method, and the concept of pipelining is applied to the parallel architecture. As a result, the designed pipelined row and column address generator for the 2D DCT/IDCT architecture is implemented efficiently and at high speed compared with the non-pipelined architecture.
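
For context on what the row/column addressing serves, here is a minimal sketch of the row-column decomposition of the 2D DCT: a 1D DCT applied along rows, then along columns of the result. This is standard DCT mathematics, not the paper's DA hardware design.

```python
# Sketch of the row-column method for the 2D DCT. The orthonormal
# DCT-II basis matrix C gives y = C @ x as the 1D transform; applying
# it on both sides computes the 2D transform row-wise then column-wise.
import numpy as np

N = 8
k = np.arange(N)
C = np.sqrt(2 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = np.sqrt(1 / N)        # DC row scaling for orthonormality

def dct2(block):
    """2D DCT of an N x N block via row-column decomposition."""
    return C @ block @ C.T

def idct2(coeffs):
    """Inverse transform: C is orthogonal, so its transpose inverts it."""
    return C.T @ coeffs @ C

block = np.random.rand(N, N)
assert np.allclose(idct2(dct2(block)), block)   # round-trip check
```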

Scalable Big Data Pipeline for Video Stream Analytics Over Commodity Hardware

  • Ayub, Umer;Ahsan, Syed M.;Qureshi, Shavez M.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1146-1165 / 2022
  • A huge amount of data in the form of videos and images is being produced owing to advancements in sensor technology. Using low-performance commodity hardware together with resource-heavy image processing and analysis approaches to infer and extract actionable insights from this data creates a bottleneck for timely decision making. Current GPU-assisted and cloud-based video analysis architectures give significant performance gains, but their use is constrained by financial considerations and extremely complex architecture-level details. In this paper we propose a data pipeline system that uses open-source tools such as Apache Spark, Kafka, and OpenCV running over commodity hardware for video stream processing and image processing in a distributed environment. Experimental results show that our proposed approach eliminates the need for GPU-based hardware and cloud computing infrastructure to achieve efficient video stream processing for face detection with increased throughput, scalability, and better performance.
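
A minimal sketch of this style of pipeline, assuming a Kafka broker at localhost:9092, a topic named video-frames carrying JPEG-encoded frames, and the spark-sql-kafka connector on the classpath; the topic name, broker address, and cascade choice are illustrative assumptions, not details from the paper.

```python
# Sketch: Spark Structured Streaming reads JPEG frames from Kafka and
# an OpenCV Haar cascade counts faces per frame on commodity CPUs.
import cv2
import numpy as np
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.appName("video-pipeline").getOrCreate()

@udf(returnType=IntegerType())
def count_faces(jpeg_bytes):
    # Decode the frame and run CPU-based face detection per record.
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    frame = cv2.imdecode(np.frombuffer(jpeg_bytes, np.uint8),
                         cv2.IMREAD_GRAYSCALE)
    return 0 if frame is None else len(detector.detectMultiScale(frame))

frames = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "video-frames")
          .load())

query = (frames.select(count_faces(frames.value).alias("faces"))
         .writeStream.format("console").start())
query.awaitTermination()
```

Scaling out is then a matter of adding Kafka partitions and Spark executors, which matches the commodity-hardware argument the abstract makes.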

Design of Pipeline Architecture for the Row and Column Address Generator of 2D DCT/IDCT (2D DCT/IDCT의 행, 열 주소생성기를 위한 파이프라인 구조 설계)

  • 노진수;박종태;문규성;성해경;이강현
    • Proceedings of the Korea Multimedia Society Conference / 2003.05b / pp.14-18 / 2003
  • This paper presents a pipeline architecture for the row and column address generator of a 2D DCT/IDCT (Discrete Cosine Transform/Inverse Discrete Cosine Transform). Real-time processing of image data requires both high-speed operation and small hardware size. In the proposed architecture, the hardware area is reduced by using the DA (distributed arithmetic) method and by applying the concept of pipelining to the parallel architecture. As a result, the designed pipelined row and column address generator for the 2D DCT/IDCT architecture is implemented efficiently and at high speed compared with the non-pipelined architecture, improving operation speed by about 50%. The proposed pipeline architecture for the DCT/IDCT is coded in VHDL.
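
The abstract does not show the generator itself; as a minimal sketch of what a row/column address generator sequences (row-major addresses for the first 1D DCT pass, column-major for the transposed second pass), modeled here as a plain Python generator rather than the paper's VHDL counters:

```python
# Sketch of a row/column address generator for the row-column 2D DCT:
# pass 1 reads the N x N buffer in row order, pass 2 in column order.
# In hardware this is a pair of counters; here it is a generator.
N = 8

def address_stream():
    for r in range(N):                 # pass 1: row-major addresses
        for c in range(N):
            yield ("row-pass", r * N + c)
    for c in range(N):                 # pass 2: column-major addresses
        for r in range(N):
            yield ("col-pass", r * N + c)

for pass_name, addr in list(address_stream())[:3]:
    print(pass_name, addr)             # row-pass 0, 1, 2, ...
```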

Pipeline-MapReduce Model for Processing Large Data Sets in Distributed Systems (대용량 분산 데이터 처리를 위한 Pipeline-MapReduce 모델)

  • Kim, Sun Jo;Kim, Taehyoung;Eom, Young Ik
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.121-122 / 2009
  • As the amount of information on the Internet grows rapidly, ISPs are studying methods to process and analyze data effectively. A representative example is Google's MapReduce model, a technique for large-scale distributed data processing. In this paper, we propose Pipeline-MapReduce, a technique that improves performance by applying a pipelining scheme to the existing MapReduce model. Experiments show that the proposed technique delivers faster processing than the existing approach.
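
A minimal sketch of the pipelining idea the abstract describes: removing the barrier between the map and reduce phases so the reducer consumes map output as it is emitted. The word-count job and the single mapper/reducer pair are illustrative simplifications, not the paper's system.

```python
# Sketch of Pipeline-MapReduce: the reducer runs concurrently with the
# mapper and folds in each (key, value) pair as soon as it is emitted,
# instead of waiting for the whole map phase to finish.
from collections import Counter
from queue import Queue
from threading import Thread

def mapper(docs, out: Queue):
    for doc in docs:
        for word in doc.split():
            out.put((word, 1))          # emit immediately; no phase barrier
    out.put(None)                        # end-of-stream marker

def reducer(inbox: Queue, result: Counter):
    while (pair := inbox.get()) is not None:
        result[pair[0]] += pair[1]       # reduce while mapping continues

q, counts = Queue(), Counter()
threads = [Thread(target=mapper, args=(["a b a", "b c"], q)),
           Thread(target=reducer, args=(q, counts))]
for t in threads: t.start()
for t in threads: t.join()
print(counts)                            # Counter({'a': 2, 'b': 2, 'c': 1})
```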

Distributed In-Memory Caching Method for ML Workload in Kubernetes (쿠버네티스에서 ML 워크로드를 위한 분산 인-메모리 캐싱 방법)

  • Dong-Hyeon Youn;Seokil Song
    • Journal of Platform Technology / v.11 no.4 / pp.71-79 / 2023
  • In this paper, we analyze the characteristics of machine learning workloads and, based on this analysis, propose a distributed in-memory caching technique to improve their performance. The core of a machine learning workload is model training, which is computationally intensive. Running machine learning workloads in a Kubernetes-based cloud environment in which the computing framework and storage are separated allows resources to be allocated effectively, but delays can occur because I/O must go through network communication. We propose a distributed in-memory caching technique to improve the performance of machine learning workloads executed in such an environment. In particular, we propose a new method that pre-caches the data required by a machine learning workload into the distributed in-memory cache, taking into account Kubeflow Pipelines, a Kubernetes-based machine learning pipeline management tool.
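
A minimal sketch of the pre-caching idea, using Redis as an assumed stand-in for the distributed in-memory cache (the abstract does not name a specific cache): a pipeline warm-up step loads data shards into the cache before the training step reads them, so training I/O avoids the remote-storage round trip.

```python
# Sketch of pre-caching for an ML pipeline: a warm-up step fills a
# shared in-memory cache (Redis here, as an assumed stand-in) and the
# training step reads from it, falling back to storage on a miss.
import redis  # pip install redis

cache = redis.Redis(host="localhost", port=6379)

def precache(shard_paths):
    """Warm-up pipeline step: pull shards from storage into the cache."""
    for path in shard_paths:
        with open(path, "rb") as f:      # local read stands in for remote storage I/O
            cache.set(f"shard:{path}", f.read())

def load_shard(path):
    """Training pipeline step: hit the cache first, storage on a miss."""
    data = cache.get(f"shard:{path}")
    if data is None:                     # cache miss: fall back to storage
        with open(path, "rb") as f:
            data = f.read()
    return data
```

In a Kubeflow Pipelines setting, precache and load_shard would live in separate pipeline components, with the warm-up component ordered before the training component.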

Design of a Pipeline Processor for the Automated ECG Diagnosis in Real Time (실시간 심전도 자동진단을 위한 파이프라인 프로세서의 설계)

  • 이경중;윤형로;이명호
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.8 / pp.1217-1226 / 1989
  • This paper describes the design of a hardware system for real-time automatic diagnosis of ECG arrhythmia, based on a pipeline processor consisting of three microcomputers. ECG data are acquired by a 12-bit A/D converter with a hardware QRS-triggered detector. Four diagnostic parameters (heart rate, morphology, axis, and ST segment) are used for the classification and diagnosis of arrhythmia. The functions of the main CPU are distributed across and processed by the three microcomputers, enabling effective data processing and real-time processing with microcomputers. An interconnection structure consisting of two common memory units is designed to decrease the delay caused by data transfer between processors; with it, the transfer delay can be kept to 1% of one clock period.
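
A minimal sketch of the three-processor pipeline that this and the following two entries describe: acquisition, feature extraction, and classification stages joined by two buffers standing in for the two common-memory units. The stage logic is a hypothetical placeholder, not the paper's diagnostic algorithm.

```python
# Sketch of the three-stage ECG pipeline: three threads stand in for
# the three microcomputers, and two queues stand in for the two common
# memory units that decouple them.
from queue import Queue
from threading import Thread

def acquire(mem1: Queue):
    for beat in ([72, 0.1], [150, 0.4]):     # fake (rate, ST) samples
        mem1.put(beat)
    mem1.put(None)                            # end of stream

def extract(mem1: Queue, mem2: Queue):
    while (beat := mem1.get()) is not None:
        mem2.put({"heart_rate": beat[0], "st": beat[1]})
    mem2.put(None)

def classify(mem2: Queue):
    while (feat := mem2.get()) is not None:
        label = "arrhythmia?" if feat["heart_rate"] > 100 else "normal"
        print(feat, "->", label)

mem1, mem2 = Queue(), Queue()                 # the two "common memory units"
stages = [Thread(target=acquire, args=(mem1,)),
          Thread(target=extract, args=(mem1, mem2)),
          Thread(target=classify, args=(mem2,))]
for s in stages: s.start()
for s in stages: s.join()
```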

Design of Pipeline Processor for ECG Feature Extraction (ECG 특징추출을 위한 파이프라인 프로세서의 설계)

  • 이경중;윤형로
    • Journal of Biomedical Engineering Research / v.9 no.1 / pp.79-86 / 1988
  • This paper describes the design of a hardware system for ECG feature extraction based on a pipeline processor consisting of three microcomputers. ECG data are acquired by a 12-bit A/D converter with a hardware QRS-triggered detector. Four diagnostic parameters (heart rate, morphology, axis, and ST segment) are used for the classification and diagnosis of arrhythmia. The functions of the main CPU are distributed across and processed by the three microcomputers, enabling effective data processing and real-time processing with microcomputers. The interconnection structure, consisting of two common memory units, is designed to decrease the delay caused by data transfer between processors; it is designed so that the transfer delay can be kept to 1% of one clock period.

A design of pipeline processor for real time ECG process (실시간 심전도 처리를 위한 파이프라인 프로세서의 설계)

  • Lee, Kyoung-Joong;Lee, Yoon-Sun;Yoon, Hyoung-Ro;Lee, Myoung-Ho
    • Proceedings of the KIEE Conference / 1988.07a / pp.731-733 / 1988
  • This paper describes the design of a hardware system for real-time automatic diagnosis of ECG arrhythmia based on a pipeline processor consisting of three microcomputers. ECG data are acquired by a 12-bit A/D converter with a hardware QRS-triggered detector. Four diagnostic parameters (heart rate, morphology, axis, and ST segment) are used for the classification and diagnosis of arrhythmia. The functions of the main CPU are distributed across and processed by the three microcomputers, enabling effective data processing and real-time processing with microcomputers. The interconnection structure consisting of two common memory units is designed to decrease the delay caused by data transfer between processors, by which the transfer delay can be kept to 1% of one clock period.
