• Title/Summary/Keyword: Multi-Core

A Study On Statistical Simulation for Asymmetric Multi-Core Processor Architectures (비대칭적 멀티코어 프로세서의 통계적 모의실험에 관한 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.2 / pp.157-163 / 2016
  • Trace-driven or execution-driven simulation of asymmetric multi-core processors requires excessive time and disk space. In this paper, statistical simulation is performed for asymmetric multi-core processors with various hardware configurations. For the experiment, SPEC 2000 benchmark programs are profiled and synthesized, and the result is supplied as input to the simulation of asymmetric multi-core processors. The performance obtained by statistical simulation is comparable to that of trace-driven simulation, with a tremendous reduction in simulation time.
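
The profiling and synthesis steps mentioned in this abstract can be illustrated with a minimal sketch. It assumes a trace is just a list of instruction-class labels and that the only statistic collected is the instruction mix; the names `profile` and `synthesize` and the toy trace are hypothetical, not taken from the paper, whose profiles are certainly richer.

```python
import random
from collections import Counter


def profile(trace):
    """Collect the instruction mix (relative frequency of each class)."""
    counts = Counter(trace)
    total = len(trace)
    return {op: n / total for op, n in counts.items()}


def synthesize(mix, length, seed=0):
    """Draw a synthetic trace whose instruction mix follows the profile."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    ops = sorted(mix)
    weights = [mix[op] for op in ops]
    return rng.choices(ops, weights=weights, k=length)


# Toy "real" trace (assumed data): 60% ALU, 30% load, 10% branch.
trace = ["alu"] * 6 + ["load"] * 3 + ["branch"] * 1
mix = profile(trace)
synthetic = synthesize(mix, 1000)
```

The synthetic trace, rather than the full benchmark trace, would then be fed to the processor model, which is where the savings in time and disk space come from.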

Analysis of Job Scheduling and the Efficiency for Multi-core Mobile GPU (멀티코어형 모바일 GPU의 작업 분배 및 효율성 분석)

  • Lim, Hyojeong;Han, Donggeon;Kim, Hyungshin
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.7 / pp.4545-4553 / 2014
  • Mobile GPUs have led to the rapid development of smartphone graphics technology. Most recent smartphones are equipped with a high-performance multi-core GPU. How efficiently a multi-core mobile GPU can be utilized will be a critical issue for improving smartphone performance. However, most current research has focused on single-core mobile GPUs; studies of multi-core mobile GPUs are rare. In this paper, the job scheduling patterns and the efficiency of a multi-core mobile GPU are analyzed. The profiling results show that, despite the higher number of GPU cores, the total processing time required for certain graphics applications increased. In addition, when the GPU is processing 3D games, a substantial amount of overhead is caused by communication not only between the CPU and GPU, but also within the GPU itself. These results confirm that more active research on multi-core mobile GPUs is needed to optimize present mobile GPUs.

Sojourn Time Analysis Using SRPT Scheduling for Heterogeneous Multi-core Systems (Heterogeneous 멀티코어 시스템에서 SRPT 스케줄링을 사용한 체류 시간 분석)

  • Yang, Bomi;Park, Hyunjae;Choi, Young-June
    • Journal of KIISE / v.44 no.3 / pp.223-231 / 2017
  • In this paper, we study the performance of the multi-core systems that have recently become popular in mobile devices. Previous research on multi-core performance has usually focused on the desktop PC, which leaves scope for further analysis of heterogeneous multi-core systems. Therefore, extending the analysis of homogeneous multi-core systems, we analyze heterogeneous multi-core systems that use Size Interval Task Allocation (SITA) for job allocation and Shortest Remaining Processing Time (SRPT) scheduling on each individual core. We propose a new method for computing the cutoff point, which is crucial in analyzing SITA, by calculating the sojourn time. This facilitates easy and accurate calculation of the sojourn time. We further confirm our analysis with the ESESC simulator, which provides actual measurements.
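
To make the SITA and SRPT terms concrete, here is a small simulation sketch. It assumes two cores of different speeds, a single size cutoff, and jobs given as `(arrival_time, size)` pairs; the function names, the speeds, and the job list are illustrative assumptions, and the sketch measures sojourn times by simulation rather than reproducing the paper's analytical method.

```python
import heapq


def sita_split(jobs, cutoff):
    """SITA: jobs no larger than the cutoff go to one core, the rest to the other."""
    small = [j for j in jobs if j[1] <= cutoff]
    large = [j for j in jobs if j[1] > cutoff]
    return small, large


def srpt_sojourn_times(jobs, speed=1.0):
    """Preemptive SRPT on one core; returns sojourn times in input order."""
    n = len(jobs)
    order = sorted(range(n), key=lambda i: jobs[i][0])
    heap = []            # entries are (remaining_work, job_index)
    finish = [0.0] * n
    t, k = 0.0, 0
    while k < n or heap:
        if not heap:     # core is idle: jump to the next arrival
            t = max(t, jobs[order[k]][0])
        while k < n and jobs[order[k]][0] <= t:
            i = order[k]
            heapq.heappush(heap, (jobs[i][1], i))
            k += 1
        rem, i = heapq.heappop(heap)   # job with the least remaining work
        next_arrival = jobs[order[k]][0] if k < n else float("inf")
        run = min(rem / speed, next_arrival - t)
        t += run
        rem -= run * speed
        if rem > 1e-12:
            heapq.heappush(heap, (rem, i))  # preempted by a new arrival
        else:
            finish[i] = t
    return [finish[i] - jobs[i][0] for i in range(n)]


jobs = [(0.0, 4.0), (1.0, 1.0), (2.0, 6.0)]          # assumed workload
small, large = sita_split(jobs, cutoff=2.0)
little_core = srpt_sojourn_times(small, speed=1.0)   # assumed slow core
big_core = srpt_sojourn_times(large, speed=2.0)      # assumed fast core
mean_sojourn = sum(little_core + big_core) / len(jobs)
```

Sweeping `cutoff` and recomputing `mean_sojourn` shows why the choice of cutoff point matters: it decides how much work each core receives.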

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.17 no.6 / pp.1-11 / 2012
  • As process technology scales down, multi-core processors face serious problems such as increased interconnection delay, high power consumption, and thermal problems. To solve these problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by significantly reducing wire length, since cores on different layers are connected using vertical through-silicon vias (TSVs). However, the power density in the 3D multi-core processor is dramatically higher than in the 2D multi-core processor, because multiple cores are stacked vertically. Increased power density causes thermal problems, resulting in high cooling cost and a negative impact on reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors while varying the cache organization. Then, we propose a low-temperature cache organization to overcome the thermal problems. Our evaluation shows that the peak temperature of the instruction cache is lower than the threshold, whereas the peak temperature of the data cache is higher than the threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.

Analysis on the Temperature of 3D Multi-core Processors according to Vertical Placement of Core and L2 Cache (코어와 L2 캐쉬의 수직적 배치 관계에 따른 3차원 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Park, Jae-Hyung;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.16 no.6 / pp.1-10 / 2011
  • In designing multi-core processors, interconnection delay is one of the major constraints on performance improvement. To solve this problem, 3-dimensional integration technology has been adopted in designing multi-core processors. The 3D multi-core architecture can reduce the physical wire length by stacking cores vertically, leading to reduced interconnection delay and reduced power consumption. However, the power density of the 3D multi-core architecture is significantly higher than that of the traditional 2D multi-core architecture, resulting in increased processor temperature. In this paper, floorplan methods that change the vertical placement of the core and the level-2 cache are analyzed to solve the thermal problems in 3D multi-core processors. According to the experimental results, stacking the core adjacently to the level-2 cache is an effective way to reduce the processor temperature. Compared to the floorplan where cores are stacked adjacently to each other, the floorplan where the core is stacked adjacently to the level-2 cache can reduce the temperature by 22% in the case of four layers, and by 13% in the case of two layers.

Efficient Fault-Recovery Technique for CGRA-based Multi-Core Architecture

  • Kim, Yoonjin;Sohn, Seungyeon
    • JSTS: Journal of Semiconductor Technology and Science / v.15 no.2 / pp.307-311 / 2015
  • In this paper, we propose an efficient fault-recovery technique for CGRA (Coarse-Grained Reconfigurable Architecture) based multi-core architecture. The proposed technique is an intra/inter-CGRA co-reconfiguration technique based on a ring-based sharing fabric (RSF), and it enables exploiting the inherent redundancy and reconfigurability of the multi-CGRA for fault recovery. Experimental results show that the proposed approach achieves up to 73% fault recoverability when compared with a completely connected fabric (CCF).

A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs (SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers / v.62 no.2 / pp.252-256 / 2013
  • To overcome the complexity and performance limits of superscalar processors, the multi-core architecture has recently become prevalent. The number of cores in a multi-core processor architecture usually ranges from 2 to 16. However, in the near future, more than 32 cores are likely to be utilized, which is called a many-core processor architecture. Using SPEC 2000 benchmarks as input, extensive trace-driven simulation has been performed for many-core architectures with 32 to 1024 cores. For 1024 cores, the average performance reaches 15.7 IPC, but the rate of performance increase saturates.

Cost-Aware Scheduling of Computation-Intensive Tasks on Multi-Core Server

  • Ding, Youwei;Liu, Liang;Hu, Kongfa;Dai, Caiyan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.11 / pp.5465-5480 / 2018
  • Energy-efficient task scheduling on multi-core servers is a fundamental issue in green cloud computing. Multi-core processors are widely used in mobile devices, personal computers, and servers. Existing energy-efficient task scheduling methods chiefly focus on reducing the energy consumption of the processor itself, and assume that the cores of the processor are controlled independently. However, the cores of some processors on the market are divided into several voltage islands, within each of which the cores must operate in the same state. Moreover, the cost of the server includes not only the energy cost of the processor but also the energy of the other server components and the cost of user waiting time. In this paper, we propose ICAS, a cost-aware scheduling algorithm for computation-intensive tasks on multi-core servers. Tasks are first allocated to cores, then the optimal frequency of each core is computed, and finally the frequency of each voltage island is determined. The experimental results show that the cost of ICAS is much lower than that of the existing method.
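
The voltage-island constraint can be illustrated with a short sketch. The abstract does not say how ICAS chooses an island's frequency, so the rule below (raise every island to the lowest available level that satisfies its most demanding core) is only one plausible assumption, and the function name, frequency levels, and core requirements are hypothetical.

```python
def island_frequencies(core_freq_ghz, islands, levels_ghz):
    """Pick one frequency per voltage island.

    core_freq_ghz: per-core frequency each core would ideally run at.
    islands: list of lists of core indices that share a voltage island.
    levels_ghz: discrete frequency levels the hardware offers.
    Assumed rule (not necessarily ICAS): lowest level that is at least the
    largest per-core requirement in the island, capped at the top level.
    """
    levels = sorted(levels_ghz)
    chosen = []
    for cores in islands:
        need = max(core_freq_ghz[c] for c in cores)
        feasible = [f for f in levels if f >= need]
        chosen.append(feasible[0] if feasible else levels[-1])
    return chosen


# Four cores in two islands; cores 0 and 1 share one island, 2 and 3 the other.
per_core = [1.1, 1.9, 0.7, 0.8]
freqs = island_frequencies(per_core, [[0, 1], [2, 3]], [1.0, 1.5, 2.0])
```

In this example core 0 ends up running faster than it needs to because it shares an island with core 1, which is the kind of coupling that per-core scheduling methods ignore.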

The Effects of Runner Core Pin on the Filling Imbalance Occurred in Multi Cavity Injection Mold (다수 캐비티 사출금형에서 러너 코어핀이 충전불균형에 미치는 영향)

  • Kang C. M.;Jeong Y. D.;Han K. T.
    • Proceedings of the Korean Society for Technology of Plasticity Conference / 2005.05a / pp.39-42 / 2005
  • For mass production, an injection mold usually has multiple cavities that are filled through a geometrically balanced runner system. Despite the geometrically balanced runner system, filling imbalances between cavities have always been observed. These filling imbalances are one of the most significant factors affecting the quality of plastic parts molded in a multi-cavity injection mold. Filling imbalances result from a non-symmetrical shear rate distribution within the melt as it flows through the runner system. It has been possible to decrease the filling imbalance by optimizing processing conditions, but this has not completely eliminated the phenomenon during injection molding. This paper presents a solution to these filling imbalances using a 'runner core pin'. The runner core pin developed in this study creates a symmetrical shear distribution within the runner. With the runner core pin, a remarkable reduction in filling imbalance was confirmed.

Exploration of an Optimal Two-Dimensional Multi-Core System for Singular Value Decomposition (특이치 분해를 위한 최적의 2차원 멀티코어 시스템 탐색)

  • Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.21-31 / 2014
  • Singular value decomposition (SVD) has been widely used to identify unique features from a data set in various fields. However, the complex matrix calculation of SVD requires tremendous computation time. This paper improves the performance of a representative one-sided block Jacobi algorithm using a two-dimensional (2D) multi-core system. In addition, this paper explores an optimal multi-core system by varying the number of processing elements in the 2D multi-core system, with the same 400 MHz clock frequency and TSMC 28 nm technology, for each matrix size of the one-sided block Jacobi algorithm ($128 \times 128$, $64 \times 64$, $32 \times 32$, $16 \times 16$). Moreover, this paper demonstrates the potential of the 2D multi-core system for the one-sided block Jacobi algorithm by comparing the performance of the multi-core system with a commercial high-performance graphics processing unit (GPU).
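
For readers unfamiliar with the method, the following is a minimal sketch of the plain (non-block, sequential) one-sided Jacobi iteration, often attributed to Hestenes. It is not the paper's block or multi-core implementation; the function name, sweep limit, and tolerance are assumptions chosen for illustration. Each step rotates a pair of columns until they are orthogonal, and the singular values are the final column norms.

```python
import math


def one_sided_jacobi_singular_values(a, sweeps=30, tol=1e-12):
    """Singular values of matrix `a` (list of rows), largest first."""
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]          # work on a copy
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(u[i][p] * u[i][p] for i in range(m))
                beta = sum(u[i][q] * u[i][q] for i in range(m))
                gamma = sum(u[i][p] * u[i][q] for i in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue           # columns already orthogonal enough
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (
                    abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    up, uq = u[i][p], u[i][q]
                    u[i][p] = c * up - s * uq
                    u[i][q] = s * up + c * uq
        if not rotated:
            break                      # a full sweep made no changes
    norms = [math.sqrt(sum(u[i][j] ** 2 for i in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)


sigma = one_sided_jacobi_singular_values([[1.0, 1.0], [0.0, 1.0]])
```

Rotations on disjoint column pairs are independent of one another, which is what makes this family of algorithms a natural fit for arrays of processing elements.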