• Title/Summary/Keyword: heterogeneous computing

Search Result 402, Processing Time 0.023 seconds

Analysis of Implementing Mobile Heterogeneous Computing for Image Sequence Processing

  • BAEK, Aram;LEE, Kangwoon;KIM, Jae-Gon;CHOI, Haechul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.10
    • /
    • pp.4948-4967
    • /
    • 2017
  • On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, the real-time processing of mobile devices is still a challenge because of constraints and demands for higher resolution images. Recently, heterogeneous computing methods that utilize both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate the image sequence processing. This paper deals with various optimizing techniques such as parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer object, and double buffering for parallel and/or distributed tasks. Using the optimizing techniques both individually and combined, several heterogeneous computing structures were implemented and their effectiveness were analyzed. The experimental results show that the heterogeneous computing facilitates executions up to 3.5 times faster than CPU-only processing.

Toward High Utilization of Heterogeneous Computing Resources in SNP Detection

  • Lim, Myungeun;Kim, Minho;Jung, Ho-Youl;Kim, Dae-Hee;Choi, Jae-Hun;Choi, Wan;Lee, Kyu-Chul
    • ETRI Journal
    • /
    • v.37 no.2
    • /
    • pp.212-221
    • /
    • 2015
  • As the amount of re-sequencing genome data grows, minimizing the execution time of an analysis is required. For this purpose, recent computing systems have been adopting both high-performance coprocessors and host processors. However, there are few applications that efficiently utilize these heterogeneous computing resources. This problem equally refers to the work of single nucleotide polymorphism (SNP) detection, which is one of the bottlenecks in genome data processing. In this paper, we propose a method for speeding up an SNP detection by enhancing the utilization of heterogeneous computing resources often used in recent high-performance computing systems. Through the measurement of workload in the detection procedure, we divide the SNP detection into several task groups suitable for each computing resource. These task groups are scheduled using a window overlapping method. As a result, we improved upon the speedup achieved by previous open source applications by a magnitude of 10.

A Performance Comparison of Parallel Programming Models on Edge Devices (엣지 디바이스에서의 병렬 프로그래밍 모델 성능 비교 연구)

  • Dukyun Nam
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.4
    • /
    • pp.165-172
    • /
    • 2023
  • Heterogeneous computing is a technology that utilizes different types of processors to perform parallel processing. It maximizes task processing and energy efficiency by leveraging various computing resources such as CPUs, GPUs, and FPGAs. On the other hand, edge computing has developed with IoT and 5G technologies. It is a distributed computing that utilizes computing resources close to clients, thereby offloading the central server. It has evolved to intelligent edge computing combined with artificial intelligence. Intelligent edge computing enables total data processing, such as context awareness, prediction, control, and simple processing for the data collected on the edge. If heterogeneous computing can be successfully applied in the edge, it is expected to maximize job processing efficiency while minimizing dependence on the central server. In this paper, experiments were conducted to verify the feasibility of various parallel programming models on high-end and low-end edge devices by using benchmark applications. We analyzed the performance of five parallel programming models on the Raspberry Pi 4 and Jetson Orin Nano as low-end and high-end devices, respectively. In the experiment, OpenACC showed the best performance on the low-end edge device and OpenSYCL on the high-end device due to the stability and optimization of system libraries.

Efficient Workload Distribution of Photomosaic Using OpenCL into a Heterogeneous Computing Environment (이기종 컴퓨팅 환경에서 OpenCL을 사용한 포토모자이크 응용의 효율적인 작업부하 분배)

  • Kim, Heegon;Sa, Jaewon;Choi, Dongwhee;Kim, Haelyeon;Lee, Sungju;Chung, Yongwha;Park, Daihee
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.8
    • /
    • pp.245-252
    • /
    • 2015
  • Recently, parallel processing methods with accelerator have been introduced into a high performance computing and a mobile computing. The photomosaic application can be parallelized by using inherent data parallelism and accelerator. In this paper, we propose a way to distribute the workload of the photomosaic application into a CPU and GPU heterogeneous computing environment. That is, the photomosaic application is parallelized using both CPU and GPU resource with the asynchronous mode of OpenCL, and then the optimal workload distribution rate is estimated by measuring the execution time with CPU-only and GPU-only distribution rates. The proposed approach is simple but very effective, and can be applied to parallelize other applications on a CPU and GPU heterogeneous computing environment. Based on the experimental results, we confirm that the performance is improved by 141% into a heterogeneous computing environment with the optimal workload distribution compared with using GPU-only method.

Development of Computational Science Simulation Management Program in Heterogeneous Computing Environments (이종 컴퓨팅 환경에서의 계산과학 시뮬레이션 관리 프로그램 개발)

  • Byun, Hee-Jung;Yu, Jung-Lok
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.8
    • /
    • pp.9-17
    • /
    • 2018
  • Heterogeneous high performance computing systems are gaining acceptance as the environments for computational scientific simulations of various application fields. Those computing systems, however, have been mostly used with the legacy consoles, resulting in the severe decrement of accessibility and usability of heterogeneous computing assets. To solve this problem, this paper presents the design and implementation of web-based computational science simulation management program. The proposed program provides fundamental primitives including user authentication, data management, physical/virtual computing resource management, job management, etc. that can be used to manage different kinds of simulations efficiently, and also offers highly extensible feature through a modular plug-in architecture. We also present the best practical examples of applications (e.g., scientific simulation education and bio-medical) to confirm our program's effectiveness.

Resource Allocation for Heterogeneous Service in Green Mobile Edge Networks Using Deep Reinforcement Learning

  • Sun, Si-yuan;Zheng, Ying;Zhou, Jun-hua;Weng, Jiu-xing;Wei, Yi-fei;Wang, Xiao-jun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.7
    • /
    • pp.2496-2512
    • /
    • 2021
  • The requirements for powerful computing capability, high capacity, low latency and low energy consumption of emerging services, pose severe challenges to the fifth-generation (5G) network. As a promising paradigm, mobile edge networks can provide services in proximity to users by deploying computing components and cache at the edge, which can effectively decrease service delay. However, the coexistence of heterogeneous services and the sharing of limited resources lead to the competition between various services for multiple resources. This paper considers two typical heterogeneous services: computing services and content delivery services, in order to properly configure resources, it is crucial to develop an effective offloading and caching strategies. Considering the high energy consumption of 5G base stations, this paper considers the hybrid energy supply model of traditional power grid and green energy. Therefore, it is necessary to design a reasonable association mechanism which can allocate more service load to base stations rich in green energy to improve the utilization of green energy. This paper formed the joint optimization problem of computing offloading, caching and resource allocation for heterogeneous services with the objective of minimizing the on-grid power consumption under the constraints of limited resources and QoS guarantee. Since the joint optimization problem is a mixed integer nonlinear programming problem that is impossible to solve, this paper uses deep reinforcement learning method to learn the optimal strategy through a lot of training. Extensive simulation experiments show that compared with other schemes, the proposed scheme can allocate resources to heterogeneous service according to the green energy distribution which can effectively reduce the traditional energy consumption.

Performance Improvement of BLAST using Grid Computing and Implementation of Genome Sequence Analysis System (그리드 컴퓨팅을 이용한 BLAST 성능개선 및 유전체 서열분석 시스템 구현)

  • Kim, Dong-Wook;Choi, Han-Suk
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.7
    • /
    • pp.81-87
    • /
    • 2010
  • This paper proposes a G-BLAST(BLAST using Grid Computing) system, an integrated software package for BLAST searches operated in heterogeneous distributed environment. G-BLAST employed 'database splicing' method to improve the performance of BLAST searches using exists computing resources. G-BLAST is a basic local alignment search tool of DNA Sequence using grid computing in heterogeneous distributed environment. The G-BLAST improved the existing BLAST search performance in gene sequence analysis. Also G-BLAST implemented the pipeline and data management method for users to easily manage and analyze the BLAST search results. The proposed G-BLAST system has been confirmed the speed and efficiency of BLAST search performance in heterogeneous distributed computing.

A Multi-Class Task Scheduling Strategy for Heterogeneous Distributed Computing Systems

  • El-Zoghdy, S.F.;Ghoneim, Ahmed
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.1
    • /
    • pp.117-135
    • /
    • 2016
  • Performance enhancement is one of the most important issues in high performance distributed computing systems. In such computing systems, online users submit their jobs anytime and anywhere to a set of dynamic resources. Jobs arrival and processes execution times are stochastic. The performance of a distributed computing system can be improved by using an effective load balancing strategy to redistribute the user tasks among computing resources for efficient utilization. This paper presents a multi-class load balancing strategy that balances different classes of user tasks on multiple heterogeneous computing nodes to minimize the per-class mean response time. For a wide range of system parameters, the performance of the proposed multi-class load balancing strategy is compared with that of the random distribution load balancing, and uniform distribution load balancing strategies using simulation. The results show that, the proposed strategy outperforms the other two studied strategies in terms of average task response time, and average computing nodes utilization.

Accelerating Distance Transform Image based Hand Detection using CPU-GPU Heterogeneous Computing

  • Yi, Zhaohua;Hu, Xiaoqi;Kim, Eung Kyeu;Kim, Kyung Ki;Jang, Byunghyun
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.16 no.5
    • /
    • pp.557-563
    • /
    • 2016
  • Most of the existing hand detection methods rely on the contour shape of hand after skin color segmentation. Such contour shape based computations, however, are not only susceptible to noise and other skin color segments but also inherently sequential and difficult to efficiently parallelize. In this paper, we implement and accelerate our in-house distance image based approach using CPU-GPU heterogeneous computing. Using emerging CPU-GPU heterogeneous computing technology, we achieved 5.0 times speed-up for $320{\times}240$ images, and 17.5 times for $640{\times}480$ images and our experiment demonstrates that our proposed distance image based hand detection is robust and fast, reaching up to 97.32% palm detection rate, 80.4% of which have more than 3 fingers detected on commodity processors.

Parallel LDPC Decoding on a Heterogeneous Platform using OpenCL

  • Hong, Jung-Hyun;Park, Joo-Yul;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.6
    • /
    • pp.2648-2668
    • /
    • 2016
  • Modern mobile devices are equipped with various accelerated processing units to handle computationally intensive applications; therefore, Open Computing Language (OpenCL) has been proposed to fully take advantage of the computational power in heterogeneous systems. This article introduces a parallel software decoder of Low Density Parity Check (LDPC) codes on an embedded heterogeneous platform using an OpenCL framework. The LDPC code is one of the most popular and strongest error correcting codes for mobile communication systems. Each step of LDPC decoding has different parallelization characteristics. In the proposed LDPC decoder, steps suitable for task-level parallelization are executed on the multi-core central processing unit (CPU), and steps suitable for data-level parallelization are processed by the graphics processing unit (GPU). To improve the performance of OpenCL kernels for LDPC decoding operations, explicit thread scheduling, vectorization, and effective data transfer techniques are applied. The proposed LDPC decoder achieves high performance and high power efficiency by using heterogeneous multi-core processors on a unified computing framework.