• Title/Summary/Keyword: heterogeneous computing (이기종 컴퓨팅)

Search Result 140, Processing Time 0.043 seconds

Heterogeneous Multi-Core Processor and Software Technology Trend for Embedded Devices (임베디드 기기를 위한 이기종 멀티코어 프로세서 및 소프트웨어 기술 동향)

  • Na, G.J.;Baek, W.K.;Jung, Y.J.
    • Electronics and Telecommunications Trends / v.28 no.2 / pp.1-10 / 2013
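  • If the 1980s and 1990s were the era of server- and desktop-centric computing, the 2000s saw the embedded processor market, including the mobile sector, expand rapidly and restructure the industry around embedded systems. In the 2010s the embedded processor market has grown further and the technology has advanced with it, and one of the hot terms driving recent technology is heterogeneous multi-core computing. Heterogeneous multi-core hardware addresses the practical problem of delivering the high-performance computing the market demands while achieving the low power consumption that embedded devices require, and it is being adopted rapidly in embedded devices. Interest in, and development of, software that exploits heterogeneous multi-core hardware for suitable application content is advancing in step. This article therefore surveys technology trends in heterogeneous multi-core hardware and software, limited to the embedded device domain.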

Efficient Workload Distribution of Photomosaic Using OpenCL into a Heterogeneous Computing Environment (이기종 컴퓨팅 환경에서 OpenCL을 사용한 포토모자이크 응용의 효율적인 작업부하 분배)

  • Kim, Heegon;Sa, Jaewon;Choi, Dongwhee;Kim, Haelyeon;Lee, Sungju;Chung, Yongwha;Park, Daihee
    • KIPS Transactions on Computer and Communication Systems / v.4 no.8 / pp.245-252 / 2015
  • Recently, accelerator-based parallel processing has been introduced into both high-performance computing and mobile computing. The photomosaic application can be parallelized by exploiting its inherent data parallelism and an accelerator. In this paper, we propose a way to distribute the workload of the photomosaic application across a CPU-GPU heterogeneous computing environment. That is, the photomosaic application is parallelized using both CPU and GPU resources with the asynchronous mode of OpenCL, and the optimal workload distribution rate is then estimated from the execution times measured with the CPU-only and GPU-only distribution rates. The proposed approach is simple but very effective, and can be applied to parallelize other applications in a CPU-GPU heterogeneous computing environment. Experimental results confirm that performance improves by 141% in the heterogeneous computing environment with the optimal workload distribution, compared with the GPU-only method.
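
The abstract does not give the estimator itself, so the following is only a minimal sketch of one way a split could be derived from the two measurements. It assumes that execution time scales linearly with the share of work on each device and that transfer and synchronization overheads are negligible. The function name `balanced_split` and the timings are hypothetical.

```python
def balanced_split(t_cpu_only: float, t_gpu_only: float) -> tuple[float, float]:
    """Return (gpu_fraction, predicted_time) for a CPU+GPU split.

    Assumption (not from the paper): a device given a fraction f of the
    work needs f times its "whole workload" time, and both devices run
    concurrently, so the best split makes them finish at the same moment.
    """
    if t_cpu_only <= 0 or t_gpu_only <= 0:
        raise ValueError("execution times must be positive")
    total = t_cpu_only + t_gpu_only
    gpu_fraction = t_cpu_only / total              # faster device gets more work
    predicted_time = t_cpu_only * t_gpu_only / total
    return gpu_fraction, predicted_time


# Hypothetical measurements in seconds for the whole workload.
gpu_fraction, predicted = balanced_split(t_cpu_only=3.0, t_gpu_only=1.0)
print(f"GPU share: {gpu_fraction:.2f}, predicted time: {predicted:.2f} s")
```

Under these assumptions the combined time is always shorter than either single-device time. On real hardware the estimate would need to be checked against measured runs, because data transfer costs can shift the balance point.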

Efficient Task Distribution of Pig Monitoring Application using OpenCL (OpenCL을 사용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, J.;Choi, Y.;Kim, J.;Chung, Y.;Chung, Y.;Park, D.;Kim, H.
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.54-57 / 2017
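  • A pig monitoring application can be parallelized by exploiting its inherent data parallelism and using performance accelerators. In this paper, we propose a task distribution method for running a pig monitoring application in a heterogeneous computing environment consisting of a multicore-CPU and a manycore-GPU. That is, for each task, a parallel program written in OpenCL is executed on the deviceCPU and on the deviceGPU, and the most suitable processor is chosen based on the measured execution times. The proposed method is simple but very effective, and can also be applied to parallelize other applications on heterogeneous computing platforms consisting of a CPU and a GPU. Experimental results on different heterogeneous computing platforms confirm that execution with the optimal task distribution improves performance by factors of 2 and 11, respectively, compared with executing all tasks on the deviceGPU.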
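
The selection rule described in this abstract can be sketched as follows. This is an illustration rather than the authors' implementation: it assumes tasks run one after another and ignores data transfer between devices. The task names and timings are hypothetical.

```python
def assign_tasks(timings: dict[str, dict[str, float]]) -> dict[str, str]:
    """For each task, pick the device with the smallest measured time."""
    return {task: min(per_device, key=per_device.get)
            for task, per_device in timings.items()}


def total_time(timings: dict[str, dict[str, float]],
               assignment: dict[str, str]) -> float:
    """Sum of task times, assuming tasks execute sequentially."""
    return sum(timings[task][device] for task, device in assignment.items())


# Hypothetical per-task measurements in milliseconds.
timings = {
    "background_subtraction": {"deviceCPU": 4.0, "deviceGPU": 1.0},
    "labeling": {"deviceCPU": 2.0, "deviceGPU": 6.0},
}
best = assign_tasks(timings)
gpu_only = {task: "deviceGPU" for task in timings}
print(best, total_time(timings, best), total_time(timings, gpu_only))
```

The gain over a GPU-only run comes entirely from tasks that the GPU handles poorly, which is consistent with the abstract's comparison against executing every task on the deviceGPU.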

Parallel Processing Method on CPU for Image Processing on Mobile Heterogeneous Computing System (모바일 이기종 컴퓨팅 시스템에서 영상처리 고속화를 위한 CPU측 병렬처리 방법)

  • Beak, Aram;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.181-182 / 2015
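  • As the penetration and performance of mobile devices have grown rapidly, video consumption on mobile devices has also increased greatly. However, mobile platforms, which are designed to reduce power and space, have performance limits compared with desktop platforms. Much acceleration research therefore uses embedded GPUs with a SIMD architecture to overcome these limits for large-scale video processing. Image processing that operates on stored data must use the CPU together with the GPU, and heterogeneous computing systems in mobile environments have structural drawbacks such as low transfer rates between processors, the resulting latency, and the mandatory use of data formats supported by the mobile operating system. In this paper, to accelerate image processing with an embedded GPU, we apply parallel processing on the embedded CPU side to overcome these drawbacks, and our experimental results show that using the embedded CPU in a mobile heterogeneous computing architecture increases overall computational efficiency.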

NAAL: Software for controlling heterogeneous IoT devices based on neuromorphic architecture abstraction (NAAL: 뉴로모픽 아키텍처 추상화 기반 이기종 IoT 기기 제어용 소프트웨어)

  • Cho, Jinsung;Kim, Bongjae
    • Smart Media Journal / v.11 no.3 / pp.18-25 / 2022
  • Neuromorphic computing generally offers significantly better power, area, and speed characteristics than neural network computation on CPUs and GPUs. These characteristics suit resource-constrained IoT environments where energy consumption matters. However, the source code for environment setup and application operation must be modified for each heterogeneous IoT device that supports neuromorphic computing. To solve this problem, this paper proposes and implements NAAL. Through its common APIs, NAAL provides the functions needed for IoT device control, neuromorphic architecture abstraction, and inference model operation across various heterogeneous IoT device environments. NAAL also has the advantage that support for new heterogeneous IoT devices, neuromorphic architectures, and computing devices can be added in the future.

An Efficient List Scheduling Algorithm in Distributed Heterogeneous Computing System (분산 이기종 컴퓨팅 시스템에서 효율적인 리스트 스케줄링 알고리즘)

  • Yoon, Wan-Oh;Yoon, Jung-Hee;Lee, Chang-Ho;Gim, Hyo-Gi;Choi, Sang-Bang
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.3 / pp.86-95 / 2009
  • Efficient DAG scheduling is critical for achieving high performance in heterogeneous computing environments. Finding an optimal solution to the problem of scheduling an application modeled by a directed acyclic graph (DAG) onto a set of heterogeneous machines is known to be NP-complete. In this paper we propose a new list scheduling algorithm, called the Heterogeneous Rank-Path Scheduling (HRPS) algorithm, to exploit all of a program's available parallelism in a distributed heterogeneous computing system. The primary goal of HRPS is to minimize the schedule length of applications. The performance of the algorithm is evaluated by applying it to several practical DAGs and comparing it with existing scheduling algorithms such as CPOP, HCPT, and FLB in terms of schedule length. The comparison shows that HRPS significantly outperforms CPOP, HCPT, and FLB in schedule length.
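
The abstract does not describe how HRPS computes its priorities, so the sketch below is not HRPS. It shows the general shape of list scheduling on heterogeneous processors, using the upward-rank priority known from HEFT and a simple append-only placement on the processor with the earliest finish time. The names `list_schedule`, `succ`, `cost`, and `comm` and the example DAG are hypothetical, and execution costs are assumed to be positive.

```python
from functools import lru_cache


def list_schedule(succ, cost, comm):
    """Schedule a DAG on heterogeneous processors.

    succ: task -> list of successor tasks
    cost: task -> list of execution times, one entry per processor
    comm: (u, v) -> transfer time when u and v run on different processors
    Returns ({task: (processor, start, finish)}, schedule_length).
    """
    n_procs = len(next(iter(cost.values())))
    pred = {t: [] for t in cost}
    for u, successors in succ.items():
        for v in successors:
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def upward_rank(t):
        # Average cost of t plus the longest remaining path to an exit task.
        avg = sum(cost[t]) / n_procs
        tail = max((comm.get((t, s), 0.0) + upward_rank(s)
                    for s in succ.get(t, [])), default=0.0)
        return avg + tail

    # With positive costs, decreasing rank is a valid topological order.
    order = sorted(cost, key=upward_rank, reverse=True)
    proc_free = [0.0] * n_procs
    placed = {}
    for t in order:
        best = None
        for p in range(n_procs):
            # Data from a predecessor on another processor arrives later.
            ready = max((placed[u][2]
                         + (0.0 if placed[u][0] == p else comm.get((u, t), 0.0))
                         for u in pred[t]), default=0.0)
            start = max(ready, proc_free[p])
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        proc_free[best[0]] = best[2]
    return placed, max(finish for _, _, finish in placed.values())


# Hypothetical diamond-shaped DAG on two processors.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {"A": [2.0, 3.0], "B": [3.0, 1.0], "C": [2.0, 4.0], "D": [2.0, 2.0]}
comm = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "D"): 1.0, ("C", "D"): 1.0}
schedule, length = list_schedule(succ, cost, comm)
print(schedule, length)
```

Schedule length, the metric used in the abstract's comparison, is the finish time of the last task. Algorithms in this family differ mainly in how they rank tasks and choose processors.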

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

  • Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
    • KIPS Transactions on Computer and Communication Systems / v.6 no.10 / pp.407-414 / 2017
  • Pig monitoring applications consisting of many tasks can exploit inherent data parallelism and be processed in parallel using performance accelerators. In this paper, we propose a task distribution method for pig monitoring applications on a heterogeneous computing platform consisting of a multicore-CPU and a manycore-GPU. That is, a parallel program written in OpenCL is developed, and the most suitable processor for each task is then determined from its measured execution time. The proposed method is simple but very effective, and can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results on three different heterogeneous computing platforms show that the proposed task distribution method improves on the typical GPU-only method, in which every task is executed on a deviceGPU, by factors of 1.5, 8.7, and 2.7, respectively.

A CPU and GPU Heterogeneous Computing Techniques for Fast Representation of Thin Features in Liquid Simulations (액체 시뮬레이션의 얇은 특징을 빠르게 표현하기 위한 CPU와 GPU 이기종 컴퓨팅 기술)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society / v.24 no.2 / pp.11-20 / 2018
  • We propose a new particle-based method that explicitly preserves thin liquid sheets for animating liquids on a CPU-GPU heterogeneous computing framework. Our primary contribution is a particle-based framework that splits at thin points and collapses at dense points to prevent the breakup of liquid on the GPU. In contrast to existing surface tracking methods, our method does not suffer from numerical diffusion or tangles, and robustly handles topology changes on the CPU-GPU framework. Thin features are detected by examining the stretch of the distribution of neighboring particles using PCA (principal component analysis), which is also used to reconstruct thin surfaces with anisotropic kernels. The efficiency of the candidate position extraction process, which calculates the positions of fluid particles, was improved substantially by the CPU-GPU heterogeneous computing techniques. The proposed algorithm is intuitive to implement, easy to parallelize, and capable of quickly producing detailed thin liquid animations.
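
The PCA-based detection step can be illustrated with a small CPU-only sketch. The abstract gives neither the exact criterion nor the threshold, so this version makes its own assumption: a particle is flagged as thin when the smallest eigenvalue of its neighbors' covariance matrix is much smaller than the largest. The function names and `ratio_threshold` are hypothetical. The eigenvalues use the standard closed-form solution for symmetric 3x3 matrices.

```python
import math

Point = tuple[float, float, float]


def covariance(points: list[Point]) -> list[list[float]]:
    """3x3 covariance matrix of neighbor positions about their mean."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]


def symmetric_eigenvalues(a: list[list[float]]) -> list[float]:
    """Eigenvalues of a symmetric 3x3 matrix in descending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2 * p1
    p = math.sqrt(p2 / 6)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]


def is_thin(neighbors: list[Point], ratio_threshold: float = 0.1) -> bool:
    """Flag a sheet-like neighborhood (threshold is an assumption)."""
    e = symmetric_eigenvalues(covariance(neighbors))
    return e[0] > 0.0 and e[2] / e[0] < ratio_threshold


# Hypothetical neighborhoods: a tilted flat sheet and a roughly round blob.
sheet = [(1.0, 0.0, 1.0), (-1.0, 0.0, -1.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
blob = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
        (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
print(is_thin(sheet), is_thin(blob))
```

Each particle's neighborhood is analyzed independently, which is why this kind of test parallelizes easily. A GPU version would typically evaluate it once per particle in a kernel.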

Global Internet Computing Environment based on Java (자바를 기반으로 한 글로벌 인터넷 컴퓨팅 환경)

  • Kim, Hui-Cheol;Sin, Pil-Seop;Park, Yeong-Jin;Lee, Yong-Du
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2320-2331 / 1999
  • To use a collection of idle computers on the Internet as a parallel computing platform, we propose a new scheme called GICE (Global Internet Computing Environment). GICE aims at high programmability, efficient support for heterogeneous computing resources, system scalability, and high performance. The programming model of GICE is based on a single address space. GICE features a Java-based programming environment, a dynamic resource management scheme, and efficient parallel task scheduling and execution mechanisms. Based on a prototype implementation of GICE, we address the concept, feasibility, complexity, and performance of Internet computing.
