• 제목/요약/키워드: heterogeneous computing

검색결과 402건 처리시간 0.021초

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제11권10호
    • /
    • pp.4948-4967
    • /
    • 2017
  • On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, the real-time processing of mobile devices is still a challenge because of constraints and demands for higher resolution images. Recently, heterogeneous computing methods that utilize both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate the image sequence processing. This paper deals with various optimizing techniques such as parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer object, and double buffering for parallel and/or distributed tasks. Using the optimizing techniques both individually and combined, several heterogeneous computing structures were implemented and their effectiveness were analyzed. The experimental results show that the heterogeneous computing facilitates executions up to 3.5 times faster than CPU-only processing.

Toward High Utilization of Heterogeneous Computing Resources in SNP Detection

  • Lim, Myungeun;Kim, Minho;Jung, Ho-Youl;Kim, Dae-Hee;Choi, Jae-Hun;Choi, Wan;Lee, Kyu-Chul
    • ETRI Journal
    • /
    • 제37권2호
    • /
    • pp.212-221
    • /
    • 2015
  • As the amount of re-sequencing genome data grows, minimizing the execution time of an analysis is required. For this purpose, recent computing systems have been adopting both high-performance coprocessors and host processors. However, there are few applications that efficiently utilize these heterogeneous computing resources. This problem equally refers to the work of single nucleotide polymorphism (SNP) detection, which is one of the bottlenecks in genome data processing. In this paper, we propose a method for speeding up an SNP detection by enhancing the utilization of heterogeneous computing resources often used in recent high-performance computing systems. Through the measurement of workload in the detection procedure, we divide the SNP detection into several task groups suitable for each computing resource. These task groups are scheduled using a window overlapping method. As a result, we improved upon the speedup achieved by previous open source applications by a magnitude of 10.

엣지 디바이스에서의 병렬 프로그래밍 모델 성능 비교 연구 (A Performance Comparison of Parallel Programming Models on Edge Devices)

  • 남덕윤
    • 대한임베디드공학회논문지
    • /
    • 제18권4호
    • /
    • pp.165-172
    • /
    • 2023
  • Heterogeneous computing is a technology that utilizes different types of processors to perform parallel processing. It maximizes task processing and energy efficiency by leveraging various computing resources such as CPUs, GPUs, and FPGAs. On the other hand, edge computing has developed with IoT and 5G technologies. It is a distributed computing that utilizes computing resources close to clients, thereby offloading the central server. It has evolved to intelligent edge computing combined with artificial intelligence. Intelligent edge computing enables total data processing, such as context awareness, prediction, control, and simple processing for the data collected on the edge. If heterogeneous computing can be successfully applied in the edge, it is expected to maximize job processing efficiency while minimizing dependence on the central server. In this paper, experiments were conducted to verify the feasibility of various parallel programming models on high-end and low-end edge devices by using benchmark applications. We analyzed the performance of five parallel programming models on the Raspberry Pi 4 and Jetson Orin Nano as low-end and high-end devices, respectively. In the experiment, OpenACC showed the best performance on the low-end edge device and OpenSYCL on the high-end device due to the stability and optimization of system libraries.

이기종 컴퓨팅 환경에서 OpenCL을 사용한 포토모자이크 응용의 효율적인 작업부하 분배 (Efficient Workload Distribution of Photomosaic Using OpenCL into a Heterogeneous Computing Environment)

  • 김희곤;사재원;최동휘;김혜련;이성주;정용화;박대희
    • 정보처리학회논문지:컴퓨터 및 통신 시스템
    • /
    • 제4권8호
    • /
    • pp.245-252
    • /
    • 2015
  • 최근 고성능 컴퓨팅과 모바일 컴퓨팅에서 성능가속기를 사용하는 병렬처리 방법들이 소개되어왔다. 포토모자이크 응용은 내재된 데이터 병렬성을 활용하고 성능가속기를 사용하여 병렬처리가 가능하다. 본 논문에서는 CPU와 GPU로 구성된 이기종 컴퓨팅 환경에서 포토모자이크 수행 시 작업부하 분배 방법을 제안한다. 즉, 포토모자이크 응용을 비동기 방식으로 병렬화하여 CPU와 GPU 자원을 동시에 활용하고, 각 처리기에 할당할 최적의 작업부하량을 예측하기 위해 CPU-only와 GPU-only 작업 분배 환경에서 수행시간을 측정한다. 제안 방법은 간단하지만 매우 효과적이고, CPU와 GPU로 구성된 이기종 컴퓨팅 환경에서 다른 응용을 병렬화하 데에도 적용될 수 있다. 실험 결과, 이기종 컴퓨팅 환경에서 최적의 작업 분배량으로 수행한 경우, GPU-only의 방법과 비교하여 141%의 성능이 개선되었음을 확인한다.

이종 컴퓨팅 환경에서의 계산과학 시뮬레이션 관리 프로그램 개발 (Development of Computational Science Simulation Management Program in Heterogeneous Computing Environments)

  • 변희정;유정록
    • 한국콘텐츠학회논문지
    • /
    • 제18권8호
    • /
    • pp.9-17
    • /
    • 2018
  • 이종의 고성능 컴퓨팅 시스템은 최근 다양한 분야의 계산과학 시뮬레이션 처리 도구로 큰 각광을 받고 있다. 그러나 고성능 컴퓨팅 자원에 대한 활용 방법이 콘솔 기반으로 제공되기 때문에 컴퓨팅 자원에 대한 접근성과 활용성이 크게 떨어진다. 본 연구는 이러한 문제점을 해결하기 위해, 웹 기반의 이종 계산 자원 및 시뮬레이션 작업 관리 프로그램 개발에 대해 기술한다. 제안한 계산과학 시뮬레이션 관리 프로그램은 물리 가상 계산 자원 제어뿐만 아니라 사용자 인증, 데이터 관리, 시뮬레이션 작업 관리 등의 기능을 제공하며, 모듈식 플러그인 구조 설계를 통해 고도의 확장성을 가진다. 다분야 계산과학공학 교육 및 생명의료분야 적용 사례를 통해 그 우수성을 확인한다.

Resource Allocation for Heterogeneous Service in Green Mobile Edge Networks Using Deep Reinforcement Learning

  • Sun, Si-yuan;Zheng, Ying;Zhou, Jun-hua;Weng, Jiu-xing;Wei, Yi-fei;Wang, Xiao-jun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제15권7호
    • /
    • pp.2496-2512
    • /
    • 2021
  • The requirements for powerful computing capability, high capacity, low latency and low energy consumption of emerging services, pose severe challenges to the fifth-generation (5G) network. As a promising paradigm, mobile edge networks can provide services in proximity to users by deploying computing components and cache at the edge, which can effectively decrease service delay. However, the coexistence of heterogeneous services and the sharing of limited resources lead to the competition between various services for multiple resources. This paper considers two typical heterogeneous services: computing services and content delivery services, in order to properly configure resources, it is crucial to develop an effective offloading and caching strategies. Considering the high energy consumption of 5G base stations, this paper considers the hybrid energy supply model of traditional power grid and green energy. Therefore, it is necessary to design a reasonable association mechanism which can allocate more service load to base stations rich in green energy to improve the utilization of green energy. This paper formed the joint optimization problem of computing offloading, caching and resource allocation for heterogeneous services with the objective of minimizing the on-grid power consumption under the constraints of limited resources and QoS guarantee. Since the joint optimization problem is a mixed integer nonlinear programming problem that is impossible to solve, this paper uses deep reinforcement learning method to learn the optimal strategy through a lot of training. Extensive simulation experiments show that compared with other schemes, the proposed scheme can allocate resources to heterogeneous service according to the green energy distribution which can effectively reduce the traditional energy consumption.

그리드 컴퓨팅을 이용한 BLAST 성능개선 및 유전체 서열분석 시스템 구현 (Performance Improvement of BLAST using Grid Computing and Implementation of Genome Sequence Analysis System)

  • 김동욱;최한석
    • 한국콘텐츠학회논문지
    • /
    • 제10권7호
    • /
    • pp.81-87
    • /
    • 2010
  • 본 논문에서는 현재 생물정보학 연구에서 가장 많이 사용하고 있는 BLAST의 문제점을 분석하고 이에 따른 해결책을 제시하기 위하여 그리드 컴퓨팅을 이용한 G-BLAST(Grid Computing을 이용한 Basic Local Alignment Search Tool)를 제안한다. 본 연구에서 제안하고 있는 G-BLAST을 이용한 시스템은 이기종 분산 환경에서 수행이 가능한 서열분석 통합 소프트웨어 패키지이며 기존 서열분석 서비스의 취약점인 검색 성능을 개선하여 BLAST 검색 기능을 강화 하였다. 또한, BLAST 결과를 사용자가 관리 및 분석이 용이하도록 데이터베이스 및 유전체 서열분석 서비스 시스템을 구현하였다. 본 논문에서는 G-BLAST시스템의 성능확인을 위하여 병렬컴퓨팅 성능테스트 기법을 도입하여 구현된 시스템을 기존 BLAST와 속도 및 효율부분에서 비교하여 성능개선을 확인하였으며 서열결과 분석에 필요한 자료를 사용자관점에서 제공해주고 있다.

A Multi-Class Task Scheduling Strategy for Heterogeneous Distributed Computing Systems

  • El-Zoghdy, S.F.;Ghoneim, Ahmed
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제10권1호
    • /
    • pp.117-135
    • /
    • 2016
  • Performance enhancement is one of the most important issues in high performance distributed computing systems. In such computing systems, online users submit their jobs anytime and anywhere to a set of dynamic resources. Jobs arrival and processes execution times are stochastic. The performance of a distributed computing system can be improved by using an effective load balancing strategy to redistribute the user tasks among computing resources for efficient utilization. This paper presents a multi-class load balancing strategy that balances different classes of user tasks on multiple heterogeneous computing nodes to minimize the per-class mean response time. For a wide range of system parameters, the performance of the proposed multi-class load balancing strategy is compared with that of the random distribution load balancing, and uniform distribution load balancing strategies using simulation. The results show that, the proposed strategy outperforms the other two studied strategies in terms of average task response time, and average computing nodes utilization.

Accelerating Distance Transform Image based Hand Detection using CPU-GPU Heterogeneous Computing

  • Yi, Zhaohua;Hu, Xiaoqi;Kim, Eung Kyeu;Kim, Kyung Ki;Jang, Byunghyun
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • 제16권5호
    • /
    • pp.557-563
    • /
    • 2016
  • Most of the existing hand detection methods rely on the contour shape of hand after skin color segmentation. Such contour shape based computations, however, are not only susceptible to noise and other skin color segments but also inherently sequential and difficult to efficiently parallelize. In this paper, we implement and accelerate our in-house distance image based approach using CPU-GPU heterogeneous computing. Using emerging CPU-GPU heterogeneous computing technology, we achieved 5.0 times speed-up for $320{\times}240$ images, and 17.5 times for $640{\times}480$ images and our experiment demonstrates that our proposed distance image based hand detection is robust and fast, reaching up to 97.32% palm detection rate, 80.4% of which have more than 3 fingers detected on commodity processors.

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제10권6호
    • /
    • pp.2648-2668
    • /
    • 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.