• Title/Summary/Keyword: multicore architecture

Performance Study of Multicore Digital Signal Processor Architectures (멀티코어 디지털 신호처리 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.4 / pp.171-177 / 2013
  • Due to the demand for high-speed 3D graphics rendering, video file format conversion, compression, encryption, and decryption technologies, the importance of digital signal processor systems is growing rapidly. To satisfy real-time constraints, a high-performance digital signal processor is required. Therefore, as in general-purpose computer systems, digital signal processors should also be designed as multicore architectures. Using UTDSP benchmarks as input, trace-driven simulation was performed and analyzed extensively for 2- to 16-core digital signal processor architectures, with cores ranging from simple RISC to in-order and out-of-order superscalar processors, across various window sizes.

A Computer Architecture Education Framework in IT Convergence Services Era (IT융합 서비스 환경을 위한 컴퓨터 아키텍쳐 교육 프레임워크)

  • Choi, Chang Yeol;Choi, Hwang Kyu
    • Journal of Information Technology and Architecture / v.10 no.1 / pp.23-31 / 2013
  • The rapid growth of IT convergence into different application areas has drawn considerable interest in high-performance platforms and embedded systems. Industry needs well-educated computer professionals with a practical understanding of the emerging technologies and core issues of contemporary popular services. In this paper, we present an education framework for computer system architecture based on rigorous analyses of the characteristics of IT convergence services and information technology trends. The proposed framework emphasizes real-world, hands-on subjects related to multicore architecture, embedded systems, and parallel processing. We believe it can be used effectively in the development and management of computer system architecture courses, to the benefit of both industry and students.

User Experience Assisted Energy-Efficient Software Design for Mobile Devices on the big.LITTLE Core Architecture (사용자 경험을 기반으로 big.LITTLE 멀티코어 구조의 스마트 모바일 단말의 에너지 소비를 최적화 하는 소프트웨어 구조 설계)

  • Lim, Sung-Hwa
    • Journal of the Semiconductor & Display Technology / v.19 no.1 / pp.23-28 / 2020
  • In smart mobile devices that embed big.LITTLE architectures, the conventional multicore assignment scheme for user applications may incur wasteful energy consumption and long response times. In this paper, we propose a user-experience-assisted, energy-efficient multicore assignment scheme. Our simulation results show that the proposed scheme achieves 40% lower energy consumption and 20% shorter response time than the legacy scheme.

A Performance Study of Embedded Multicore Processor Architectures (임베디드 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.1 / pp.163-169 / 2013
  • Recently, the importance of embedded systems has been growing rapidly. In order to satisfy the real-time constraints of the system, a high-performance embedded processor is required. Therefore, as in general-purpose computer systems, embedded processors should also be designed as multicore architectures. Using MiBench benchmarks as input, trace-driven simulation was performed and analyzed extensively for 2-core to 16-core embedded processor architectures, with core types ranging from simple RISC to in-order and out-of-order superscalar processors. As a result, the achievable performance is up to 23 times that of the single-core embedded RISC processor.

Implementation of the SIMT based Image Signal Processor for the Image Processing (영상처리를 위한 SIMT 기반 Image Signal Processor 구현)

  • Hwang, Yun-Seop;Jeon, Hee-Kyeong;Lee, Kwan-ho;Lee, Kwang-yeob
    • Journal of IKEEE / v.20 no.1 / pp.89-93 / 2016
  • In this paper, we propose a SIMT-based image signal processor (ISP) that can apply various image preprocessing algorithms and allows parallel processing of application programs such as image recognition. A conventional ISP implements its image enhancement algorithm in hard-wired logic, which is fast but difficult to optimize for different image processing algorithms. The proposed ISP reduces processing time by applying the SIMT architecture and, as an instruction-based processor, can run a variety of image processing algorithms. Using a Xilinx Virtex-7 board, the processing time was reduced by about 71 percent, 63 percent, and 33 percent compared with a cell multicore processor, an ARM Cortex-A9, and an ARM Cortex-A15, respectively.

Parallelization of Multifrontal Solution Method for Shared Memory Architecture (다중프론트 해법의 공유메모리 병렬화)

  • Kim, Min Ki;Kim, Jeong Ho;Park, Chan Yik;Kim, Seung Jo
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.40 no.11 / pp.972-978 / 2012
  • This paper discusses the parallelization of the multifrontal solution method, widely used for finite element structural analyses, for a shared memory architecture. The multifrontal method is easier to parallelize than other linear solution methods because its solution procedure allows unknowns to be eliminated simultaneously. Two innovative ideas are introduced to achieve optimal solver performance on a shared memory computer: pairing two frontal matrices, and splitting the frontal matrix to reduce the temporary memory space required by independent computing tasks. Performance comparisons between the original algorithm and the proposed one show that the proposed method is more computationally efficient on current multicore machines.

Design Space Exploration for NoC-Style Bus Networks

  • Kim, Jin-Sung;Lee, Jaesung
    • ETRI Journal / v.38 no.6 / pp.1240-1249 / 2016
  • With the number of IP cores in a multicore system-on-chip growing to tens or hundreds, the role of on-chip interconnection networks is vital. We propose a networks-on-chip-style bus network as a compromise and redefine the exploration problem as finding the best IP tiling patterns and communication path combinations. Before solving the problem, we estimate its time complexity and verify that solving it exhaustively is infeasible. To reduce the time complexity, we propose two fast exploration algorithms and develop a program that implements them. The program is run in several experiments, and the exploration time is reduced to approximately 1/22 and 7/1,200 at the first and second steps of the exploration process, respectively. However, as a trade-off for the time saving, the time cost (TC) of the searched architecture increases by up to 4.7% and 11.2%, respectively, at each step compared with that of the architecture obtained through full-case exploration. The reduction ratio can be lowered to 1/4,000 by applying both algorithms simultaneously, although the resulting TC increases by up to 13.1% compared with that obtained through full-case exploration.
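
The combined figure of roughly 1/4,000 is close to what one gets by multiplying the two per-step ratios. The short check below assumes the two reductions compose multiplicatively; the abstract does not state this, so treat it as a plausibility check rather than the authors' derivation.

```python
from fractions import Fraction

# Per-step exploration-time ratios quoted in the abstract.
step1 = Fraction(1, 22)
step2 = Fraction(7, 1200)

# Assumption (not stated in the abstract): the reductions multiply.
combined = step1 * step2

# Exact combined ratio, and how many times shorter the exploration becomes.
print(combined)
print(float(1 / combined))
```

Under that assumption the exploration becomes a few thousand times shorter, the same order of magnitude as the reported 1/4,000.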

Study on LLVM application in Parallel Computing System (병렬 컴퓨팅 시스템에서 LLVM 응용 연구)

  • Cho, Jungseok;Cho, Doosan;Kim, Yongyeon
    • The Journal of the Convergence on Culture Technology / v.5 no.1 / pp.395-399 / 2019
  • In order to support various parallel computing systems, it is necessary to extend LLVM IR to support vector/matrix operations more efficiently and to design a new algorithm for translating LLVM IR to machine code. As the IR example shows, RISC instructions are generated naturally because the IR is essentially composed of RISC-style instructions, and vector instructions are not supported. New IR structures, instruction generation algorithms, and related extensions are needed to support vector/matrix operations more robustly. To do this, it is important to map each instruction in the LLVM IR to the appropriate instruction in the target (vector/matrix) architecture, that is, to design the instruction selection algorithm. For the mapping to be efficient, it is necessary to understand the meaning of each LLVM IR instruction, compare it with the syntax and meaning of each instruction of the target architecture, and select the instruction that matches the pattern.

Dynamic Directory Table: On-Demand Allocation of Directory Entries for Active Shared Cache Blocks (동적 디렉터리 테이블 : 공유 캐시 블록의 디렉터리 엔트리 동적 할당)

  • Bae, Han Jun;Choi, Lynn
    • Journal of KIISE / v.44 no.12 / pp.1245-1251 / 2017
  • In this study, we present a novel directory architecture that dynamically allocates a directory entry for a cache block on demand at runtime, only when the block is shared by more than one core. Thus, we do not maintain coherence for private blocks, substantially reducing the number of directory entries. Even for shared blocks, we allocate a directory entry dynamically only when the block is actively shared, further reducing the number of directory entries at runtime. For this, we propose a new directory architecture called the dynamic directory table (DDT), which is implemented as a cache of active directory entries. Through detailed simulation on PARSEC benchmarks, we show that DDT can outperform the expensive full-map directory by a slight margin with only 17.84% of the directory area across a variety of workloads. This is achieved by its faster access and high hit rates in the small directory. In addition, we demonstrate that even smaller DDTs can give comparable or higher performance than recent directory optimization schemes such as SPACE and DGD, with considerably less area.
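
The on-demand allocation idea can be illustrated with a toy model. This is a minimal sketch, not the paper's design: the class name, the `private_owner` map (a stand-in for owner information kept outside the directory), and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict

class DynamicDirectoryTable:
    """Toy model: directory entries exist only for blocks shared by 2+ cores."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of live directory entries
        self.entries = OrderedDict()  # block -> set of sharer core ids, in LRU order
        self.private_owner = {}       # block -> sole owner (assumed to be tracked elsewhere)

    def access(self, block, core):
        if block in self.entries:
            # Block is already shared: record the sharer and refresh LRU position.
            self.entries[block].add(core)
            self.entries.move_to_end(block)
            return
        owner = self.private_owner.get(block)
        if owner is None or owner == core:
            # Private block: no directory entry is allocated.
            self.private_owner[block] = core
            return
        # A second core touches the block: allocate an entry on demand.
        del self.private_owner[block]
        if len(self.entries) >= self.capacity:
            # Assumed policy: evict the least recently used entry.
            # (Real hardware would also have to invalidate its sharers.)
            self.entries.popitem(last=False)
        self.entries[block] = {owner, core}

ddt = DynamicDirectoryTable(capacity=2)
ddt.access(0x40, core=0)  # private to core 0, no entry
ddt.access(0x80, core=1)  # private to core 1, no entry
ddt.access(0x40, core=1)  # now shared by cores 0 and 1, entry allocated
print(len(ddt.entries), ddt.entries[0x40])
```

In this sketch only the block touched by two cores occupies a directory entry, which is the property the abstract credits for the reduced directory area.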

A Performance Study of Asymmetric Embedded Multi-Core Processors (비대칭적 임베디드 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.1 / pp.233-238 / 2016
  • Recently, the multi-core processor architecture has been widely adopted in embedded processors to enhance performance. Multi-core processors are classified as either symmetric or asymmetric. Asymmetric multi-core processors are known to achieve higher performance and efficiency than symmetric multi-core processors. To study the performance gain of asymmetric multi-core embedded processors over symmetric ones, trace-driven simulation was carried out for various asymmetric embedded dual-core, quad-core, octa-core, and hexadeca-core processors, and the results were compared with those of symmetric processors of similar hardware budget, using MiBench benchmarks as input.