• Title/Summary/Keyword: 매니코어 프로세서

Search Result 30, Processing Time 0.029 seconds

Design Space Exploration of Many-Core Processors for Mobile Ultrasound Image Signal Processing (모바일 초음파 영상신호처리를 위한 매니코어 프로세서 디자인 공간 탐색)

  • Choi, Byong-Kook;Kim, Jong-Myon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.183-186
    • /
    • 2011
  • 본 논문에서는 모바일 초음파(mobile ultrasound) 영상신호의 빔포밍 알고리즘에서 요구되는 고성능 및 저전력을 만족시키는 매니코어 프로세서에 대한 디자인 공간 탐색 방법을 소개한다. 매니코어 프로세서의 디자인 공간 탐색을 위해 매니코어의 각 프로세싱 엘리먼트(Processing Element, PE)당 초음파 영상신호 데이터의 수를 변화시키는 실험을 통해 실행시간, 에너지 효율 및 시스템 면적 효율을 측정하고, 측정된 결과를 바탕으로 최적의 매니코어 프로세서 구조를 선택하였다.

Implementation of an Optimal Many-core Processor for Beamforming Algorithm of Mobile Ultrasound Image Signals (모바일 초음파 영상신호의 빔포밍 기법을 위한 최적의 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.8
    • /
    • pp.119-128
    • /
    • 2011
  • This paper introduces design space exploration of many-core processors that meet high performance and low power required by the beamforming algorithm of image signals of mobile ultrasound. For the design space exploration of the many-core processor, we mapped different number of ultrasound image data to each processing element of many-core, and then determined an optimal many-core processor architecture in terms of execution time, energy efficiency and area efficiency. Experimental results indicate that PE=4096 and 1024 provide the highest energy efficiency and area efficiency, respectively. In addition, PE=4096 achieves 46x and 10x better than TI DSP C6416, which is widely used for ultrasound image devices, in terms of energy efficiency and area efficiency, respectively.

Implementation of SIMD-based Many-Core Processor for Efficient Image Data Processing (효율적인 영상데이터 처리를 위한 SIMD기반 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.1
    • /
    • pp.1-9
    • /
    • 2011
  • Recently, as mobile multimedia devices are used more and more, the needs for high-performance and low-energy multimedia processors are increasing. Application-specific integrated circuits (ASIC) can meet the needed high performance for mobile multimedia, but they provide limited, if any, generality needed for various application requirements. DSP based systems can used for various types of applications due to their generality, but they require higher cost and energy consumption as well as less performance than ASICs. To solve this problem, this paper proposes a single instruction multiple data (SIMD) based many-core processor which supports high-performance and low-power image data processing while keeping generality. The proposed SIMD based many-core processor composed of 16 processing elements (PEs) exploits large data parallelism inherent in image data processing. Experimental results indicate that the proposed SIMD-based many-core processor higher performance (22 times better), energy efficiency (7 times better), and area efficiency (3 times better) than conversional commercial high-performance processors.

Enhancing the Performance of Multiple Parallel Applications using Heterogeneous Memory on the Intel's Next-Generation Many-core Processor (인텔 차세대 매니코어 프로세서에서의 다중 병렬 프로그램 성능 향상기법 연구)

  • Rho, Seungwoo;Kim, Seoyoung;Nam, Dukyun;Park, Geunchul;Kim, Jik-Soo
    • Journal of KIISE
    • /
    • v.44 no.9
    • /
    • pp.878-886
    • /
    • 2017
  • This paper discusses performance bottlenecks that may occur when executing high-performance computing MPI applications in the Intel's next generation many-core processor called Knights Landing(KNL), as well as effective resource allocation techniques to solve this problem. KNL is composed of a host processor to enable self-booting in addition to an existing accelerator consisting of a many-core processor, and it was released with a new type of on-package memory with improved bandwidth on top of existing DDR4 based memory. We empirically verified an improvement of the execution performance of multiple MPI applications and the overall system utilization ratio by studying a resource allocation method optimized for such new many-core processor architectures.

Design Space Exploration of Embedded Many-Core Processors for Real-Time Fire Feature Extraction (실시간 화재 특징 추출을 위한 임베디드 매니코어 프로세서의 디자인 공간 탐색)

  • Suh, Jun-Sang;Kang, Myeongsu;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.10
    • /
    • pp.1-12
    • /
    • 2013
  • This paper explores design space of many-core processors for a fire feature extraction algorithm. This paper evaluates the impact of varying the number of cores and memory sizes for the many-core processor and identifies an optimal many-core processor in terms of performance, energy efficiency, and area efficiency. In this study, we utilized 90 samples with dimensions of $256{\times}256$ (60 samples containing fire and 30 samples containing non-fire) for experiments. Experimental results using six different many-core architectures (PEs=16, 64, 256, 1,024, 4,096, and 16,384) and the feature extraction algorithm of fire indicate that the highest area efficiency and energy efficiency are achieved at PEs=1,024 and 4,096, respectively, for all fire/non-fire containing movies. In addition, all the six many-core processors satisfy the real-time requirement of 30 frames-per-second (30 fps) for the algorithm.

Parallel Implementation and Performance Evaluation of the SIFT Algorithm Using a Many-Core Processor (매니코어 프로세서를 이용한 SIFT 알고리즘 병렬구현 및 성능분석)

  • Kim, Jae-Young;Son, Dong-Koo;Kim, Jong-Myon;Jun, Heesung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.9
    • /
    • pp.1-10
    • /
    • 2013
  • In this paper, we implement the SIFT(Scale-Invariant Feature Transform) algorithm for feature point extraction using a many-core processor, and analyze the performance, area efficiency, and system area efficiency of the many-core processor. In addition, we demonstrate the potential of the proposed many-core processor by comparing the performance of the many-core processor with that of high-performance CPU and GPU(Graphics Processing Unit). Experimental results indicate that the accuracy result of the SIFT algorithm using the many-core processor was same as that of OpenCV. In addition, the many-core processor outperforms CPU and GPU in terms of execution time. Moreover, this paper proposed an optimal model of the SIFT algorithm on the many-core processor by analyzing energy efficiency and area efficiency for different octave sizes.

Implementation of an Optimal SIMD-based Many-core Processor for Sound Synthesis of Guitar (기타 음 합성을 위한 최적의 SIMD기반 매니코어 프로세서 구현)

  • Choi, Ji-Won;Kang, Myeong-Su;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.1
    • /
    • pp.1-10
    • /
    • 2012
  • Improving operating frequency of processors is no longer today's issues; a multiprocessor technique which integrates many processors has received increasing attention. Currently, high-performance processors that integrate 64 or 128 cores are developing for large data processing over 2, 4, or 8 processor cores. This paper proposes an optimal many-core processor for synthesizing guitar sounds. Unlike the previous research in which a processing element (PE) was assigned to support one of guitar strings, this paper evaluates the impacts of mapping different numbers of PEs to one guitar string in terms of performance and both area and energy efficiencies using architectural and workload simulations. Experimental results show that the maximum area energy efficiencies were achieved at PEs=24 and 96, respectively, for synthesizing guitar sounds with sampling rate of 44.1kHz and 16-bit quantization. The synthesized sounds were very similar to original guitar sounds in their spectra. In addition, the proposed many-core processor was 1,235 and 22 times better than TI TMS320C6416 in area and energy efficiencies, respectively.

Optimal Many-core Processor Architecture for Different Ultrasonic Image Resolutions (초음파 영상선호의 크기 변화에 따른 최적의 매니코어 프로세서 구조)

  • Kang, Seong-Mo;Kim, Jong-Myon
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.13 no.1
    • /
    • pp.50-55
    • /
    • 2012
  • This paper proposes an optima] many-core processor architecture that meets the requirements of low power and high performance for different ultrasonic image resolutions in hand-held ultrasonic devices. To identify the optimal many-core architecture, seven different PE configurations are simulated for processing ultrasonic images in terms of execution performance and energy consumption. Experimental results indicate that the highest energy efficiencies are achieved at PEs=1,024, 64, and 256 for ultrasonic images at $256{\times}256$, $320{\times}240$, and $800{\times}480$ resolutions, respectively. In addition, the maximum area efficiencies are obtained at PEs=256 (for $256{\times}256$ and $800{\times}480$ image resolutions) and 64 (for $320{\times}240$ image resolution).

A Study on Optimizing LRU lock for Improving Parallel I/O Throughout in Manycore CPU Systems (매니코어 CPU 시스템에서의 병렬 I/O 성능 향상을 위한 LRU 최적화 기법 연구)

  • Byun, Eun-Kyu;Bang, Jiwoo;Gu, Gibeom;Oh, Kwang-Jin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.11a
    • /
    • pp.2-4
    • /
    • 2022
  • 매니코어 CPU 시스템에서의 병렬 I/O 는 현재의 리눅스 시스템의 LRU 관리 방법의 한계로 확장성에 문제를 가지고 있다. 본 연구에서는 이 문제를 해결했던 하기 위한 개선된 FinerLRU 를 제안한다. LRU 락을 최대 코어 개수만큼 증가시키고 세분화된 Lock 관리를 통해 버퍼 캐시를 사용하는 파일 시스템의 병렬 I/O 성능을 향상시킨다. 리눅스 5.18.11 에 제안한 방법을 구현하였으며, 64 개의 물리적 코어와 256 개의 논리적 코어를 가지는 Intel Knights Landing 프로세서를 이용한 실험을 통해 두 배 가량의 성능 향상을 얻을 수 있음을 확인하였다.