• Title/Summary/Keyword: Multi-core Processors

Search Result 84, Processing Time 0.024 seconds

Analysis on the Temperature of 3D Multi-core Processors according to Vertical Placement of Core and L2 Cache (코어와 L2 캐쉬의 수직적 배치 관계에 따른 3차원 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Park, Jae-Hyung;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.6
    • /
    • pp.1-10
    • /
    • 2011
  • In designing multi-core processors, interconnection delay is one of the major constraints in performance improvement. To solve this problem, the 3-dimensional integration technology has been adopted in designing multi-core processors. The 3D multi-core architecture can reduce the physical wire length by stacking cores vertically, leading to reduced interconnection delay and reduced power consumption. However, the power density of 3D multi-core architecture is increased significantly compared to the traditional 2D multi-core architecture, resulting in the increased temperature of the processor. In this paper, the floorplan methods which change the forms of vertical placement of the core and the level-2 cache are analyzed to solve the thermal problems in 3D multi-core processors. According to the experimental results, it is an effective way to reduce the temperature in the processor that the core and the level-2 cache are stacked adjacently. Compared to the floorplan where cores are stacked adjacently to each other, the floorplan where the core is stacked adjacently to the level-2 cache can reduce the temperature by 22% in the case of 4-layers, and by 13% in the case of 2-layers.

Cost-Aware Scheduling of Computation-Intensive Tasks on Multi-Core Server

  • Ding, Youwei;Liu, Liang;Hu, Kongfa;Dai, Caiyan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.11
    • /
    • pp.5465-5480
    • /
    • 2018
  • Energy-efficient task scheduling on multi-core server is a fundamental issue in green cloud computing. Multi-core processors are widely used in mobile devices, personal computers, and servers. Existing energy efficient task scheduling methods chiefly focus on reducing the energy consumption of the processor itself, and assume that the cores of the processor are controlled independently. However, the cores of some processors in the market are divided into several voltage islands, in each of which the cores must operate on the same status, and the cost of the server includes not only energy cost of the processor but also the energy of other components of the server and the cost of user waiting time. In this paper, we propose a cost-aware scheduling algorithm ICAS for computation intensive tasks on multi-core server. Tasks are first allocated to cores, and optimal frequency of each core is computed, and the frequency of each voltage island is finally determined. The experiments' results show the cost of ICAS is much lower than the existing method.

New Hypervisor Improving Network Performance for Multi-core CE Devices

  • Hong, Cheol-Ho;Park, Miri;Yoo, Seehwan;Yoo, Chuck
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.6 no.4
    • /
    • pp.231-241
    • /
    • 2011
  • Recently, system virtualization has been applied to consumer electronics (CE) such as smart mobile phones. Although multi-core processors have become a viable solution for complex applications of consumer electronics, the issue of utilizing multi-core resources in the virtualization layer has not been researched sufficiently. In this paper, we present a new hypervisor design and implementation for multi-core CE devices. We concretely describe virtualization methods for a multi-core processor and multi-core-related subsystems. We also analyze bottlenecks of network performance in a virtualization environment that supports multimedia applications and propose an efficient virtual interrupt distributor. Our new multi-core hypervisor improves network performance by 5.5 times as compared to a hypervisor without the virtual interrupt distributor.

An Interference Matrix Based Approach to Bounding Worst-Case Inter-Thread Cache Interferences and WCET for Multi-Core Processors

  • Yan, Jun;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.2
    • /
    • pp.131-140
    • /
    • 2011
  • Different cores typically share the last-level cache in a multi-core processor. Threads running on different cores may interfere with each other. Therefore, the multi-core worst-case execution time (WCET) analyzer must be able to safely and accurately estimate the worst-case inter-thread cache interference. This is not supported by current WCET analysis techniques that manly focus on single thread analysis. This paper presents a novel approach to analyze the worst-case cache interference and bounding the WCET for threads running on multi-core processors with shared L2 instruction caches. We propose to use an interference matrix to model inter-thread interference, on which basis we can calculate the worst-case inter-thread cache interference. Our experiments indicate that the proposed approach can give a worst-case bound less than 1%, as in benchmark fib-call, and an average 16.4% overestimate for threads running on a dual-core processor with shared-L2 cache. Our approach dramatically improves the accuracy of WCET overestimatation by on average 20.0% compared to work.

A Performance Study of Multi-Core Processors with Perceptrons (퍼셉트론을 이용하는 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.63 no.12
    • /
    • pp.1704-1709
    • /
    • 2014
  • In order to increase the performance of multi-core system processor architectures, the multi-thread branch predictor which speculatively fetches and allocates threads to each core should be highly accurate. In this paper, the perceptron based multi-thread branch predictor is proposed for the multi-core processor architectures. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for the 2 to 16-core architectures employing perceptron multi-thread branch predictor extensively. Its performance is compared with the architecture which utilizes the two-level adaptive multi-thread branch predictor.

OPENMP PARALLEL PERFORMANCE OF A CFD CODE ON MULTI-CORE SYSTEMS (멀티코어 시스템에서 쓰레드 수에 따른 CFD 코드의 OpenMP 병렬 성능)

  • Kim, J.K.;Jang, K.J.;Kim, T.Y.;Cho, D.R.;Kim, S.D.;Choi, J.Y.
    • Journal of computational fluids engineering
    • /
    • v.18 no.1
    • /
    • pp.83-90
    • /
    • 2013
  • OpenMP is becoming more and more useful as a simple parallel processing paradigm on SMP (Shared Memory Multi-Processors) computing environment with the development of multi-core processors. However, very few data is available publically regarding the OpenMP performance in CFD (Computational Fluid Dynamics). In the present study a CFD test suite is prepared for the performance evaluation of OpenMP on various multi-core systems. The test suite is composed of two-dimensional numerical simulations for inviscid/viscous and reacting/non-reacting flows using three different levels of grid systems. One to five test runs were carried out on various systems from dual-core dual threads to 16-core 32-threads systems by changing the number of threads engaged for each test up to 80. The results exhibit some interesting results and the lessons learned from the tests would be quite helpful for the further use of OpenMP for CFD studies using multi-core processor systems.

Implementation and Verification of a Multi-Core Processor including Multimedia Specific Instructions (멀티미디어 전용 명령어를 내장한 멀티코어 프로세서 구현 및 검증)

  • Seo, Jun-Sang;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.8 no.1
    • /
    • pp.17-24
    • /
    • 2013
  • In this paper, we present a multi-core processor including multimedia specific instructions to process multimedia data efficiently in the mobile environment. Multimedia specific instructions exploit subword level parallelism (SLP), while the multi-core processor exploits data level parallelism (DLP). These combined parallelisms improve the performance of multimedia processing applications. The proposed multi-core processor including multimedia specific instructions is implemented and tested using a Xilinx ISE 10.1 tool and SoCMaster3 testbed system including Vertex 4 FPGA. Experimental results using a fire detection algorithm show that multimedia specific instructions outperform baseline instructions in the same multi-core architecture in terms of performance (1.2x better), energy efficiency (1.37x better), and area efficiency (1.23x better).

A Performance Evaluation of Parallel Color Conversion based on the Thread Number on Multi-core Systems (멀티코어 시스템에서 쓰레드 수에 따른 병렬 색변환 성능 검증)

  • Kim, Cheong Ghil
    • Journal of Satellite, Information and Communications
    • /
    • v.9 no.4
    • /
    • pp.73-76
    • /
    • 2014
  • With the increasing popularity of multi-core processors, they have been adopted even in embedded systems. Under this circumstance many multimedia applications can be parallelized on multi-core platforms because they usually require heavy computations and extensive memory accesses. This paper proposes an efficient thread-level parallel implementation for color space conversion on multi-core CPU. Thread-level parallelism has been becoming very useful parallel processing paradigm especially on shared memory computing systems. In this work, it is exploited by allocating different input pixels to each thread for concurrent loop executions. For the performance evaluation, this paper evaluate the performace improvements for color conversion on multi-core processors based on the processing speed comparison between its serial implementation and parallel ones. The results shows that thread-level parallel implementations show the overall similar ratios of performance improvements regardless of different multi-cores.

Performance Study of Asymmetric Multicore Processor Architectures (비대칭적 멀티코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.14 no.3
    • /
    • pp.163-169
    • /
    • 2014
  • Recently, the importance of multicore processor system is growing rapidly. Multicore processors are classified either as symmetric or asymmetric. Asymmetric multicore processors consist of a high performance complex core and number of low performance simple cores, and are known to be more efficient than symmetric multicore processors. Therefore, performance impact on various configurations of asymmetric multi-core processor needs to be studied. Using SPEC 2000 benchmarks as input, the trace-driven simulation has been performed for different asymmetric quad-core and octa-core processors and compared to the corresponding symmetric ones.

Performance Analysis and Characterization of Multi-Core Servers (멀티-코어 서버의 성능 분석 및 특성화)

  • Lee, Myung-Ho;Kang, Jun-Suk
    • The KIPS Transactions:PartA
    • /
    • v.15A no.5
    • /
    • pp.259-268
    • /
    • 2008
  • Multi-Core processors have become main-stream microprocessors in recent years. Servers based on these multi-core processors are widely adopted in High Performance Computing (HPC) and commercial business applications as well. These servers provide increased level of parallelism, thus can potentially boost the performance for applications. However, the shared resources among multiple cores on the same chip can become hot spots and act as performance bottlenecks. Therefore it is essential to optimize the use of shared resources for high performance and scalability for the multi-core servers. In this paper, we conduct experimental studies to analyze the positive and negative effects of the resource sharing on the performance of HPC applications. Through the analyses we also characterize the performance of multi-core servers.