• Title/Summary/Keyword: Multi-core Processors

Numerical Analysis on Separation Dynamics of Multi-stage Rocket System Using Parallelized Chimera Grid Scheme (병렬화된 Chimera 격자 기법을 이용한 다단 로켓의 단분리 운동 해석)

  • Ko Soon-Heum;Choi Seongjin;Kim Chongam;Rho Oh-Hyun;Park Jeong-joo
    • 한국전산유체공학회:학술대회논문집 / 2002.05a / pp.47-52 / 2002
  • The supersonic flow around a multi-stage rocket system is analyzed using a 3-D compressible unsteady flow solver. A Chimera overset grid technique is used for the present configuration, and the grid around the core rocket is composed of 3 zones to represent the fins on the core rocket. The flow solver is parallelized to reduce computation time, and an efficient parallelization algorithm for the Chimera grid technique is proposed. The AUSMPW+ scheme is used for spatial discretization and LU-SGS for time integration. The flow field around the multi-stage rocket was analyzed using the developed solver, and the results were compared with those of a sequential solver. The speed-up ratio and the efficiency were measured on several processor counts. With 12 processors, the computation was about 10 times faster than with the sequential solver. The developed flow solver is then used to predict the trajectory of the booster during the separation stage. The analyses show that the booster collides with the core rocket in the free-separation case, so the additional jettisoning forces and moments needed for a safe separation are examined.
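
The speed-up and efficiency figures in the abstract above can be related with a short calculation. This is a minimal sketch that assumes the usual definitions (speed-up = serial time / parallel time, efficiency = speed-up / processor count); the abstract does not say which definitions the authors used, and the function names are hypothetical.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speed-up ratio: serial run time divided by parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(speedup_ratio: float, n_procs: int) -> float:
    """Parallel efficiency: speed-up divided by the number of processors."""
    if n_procs <= 0:
        raise ValueError("processor count must be positive")
    return speedup_ratio / n_procs


# Figures quoted in the abstract: about 10x speed-up on 12 processors.
eff_12 = efficiency(10.0, 12)
print(f"efficiency on 12 processors: {eff_12:.2f}")  # prints 0.83
```

Under these assumed definitions, a 10-times speed-up on 12 processors corresponds to a parallel efficiency of roughly 83%.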

Dynamic Power Management Framework for Mobile Multi-core System (모바일 멀티코어 시스템을 위한 동적 전력관리 프레임워크)

  • Ahn, Young-Ho;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.7 / pp.52-60 / 2010
  • In this paper, we propose a dynamic power management framework for multi-core systems. We reduce the power consumption of multi-core processors such as the Intel Centrino Duo and ARM11 MPCore, which are used in the consumer electronics and personal computer markets. Each processor uses a different technique to save power, but no embedded multi-core processor offers a precise power control mechanism such as dynamic voltage scaling. The proposed framework is suitable for smartphones whose operating system provides multi-processing capability. It follows the intuitive idea that reducing the power consumption of idle cores is the most effective way to reduce the overall power consumption of a multi-core processor. We minimize the energy consumed by idle cores with application-targeted policies that reflect the characteristics of active workloads. We defined a set of application properties to analyze performance requirements in real time, and automated the management process so that results can be verified quickly. We tested the proposed framework on the Intel Centrino Duo and ARM11 MPCore and found that it dynamically reduced the power consumption of multi-core processors while satisfying the performance requirement of each program.

A Performance Study on Many-core Processor Architectures with SPEC Benchmark Programs (SPEC 벤치마크 프로그램에 대한 매니코어 프로세서의 성능 연구)

  • Lee, Jongbok
    • The Transactions of The Korean Institute of Electrical Engineers / v.62 no.2 / pp.252-256 / 2013
  • To overcome the complexity and performance limits of superscalar processors, the multi-core architecture has recently become prevalent. Multi-core processors typically have between 2 and 16 cores. In the near future, however, processors with more than 32 cores are likely to be used; these are called many-core processor architectures. Using SPEC 2000 benchmarks as input, trace-driven simulation was performed extensively for many-core architectures with 32 to 1024 cores. With 1024 cores, the average performance reaches 15.7 IPC, but the rate of performance increase saturates.

Multicore-Aware Code Co-Positioning to Reduce WCET on Dual-Core Processors with Shared Instruction Caches

  • Ding, Yiqiang;Zhang, Wei
    • Journal of Computing Science and Engineering / v.6 no.1 / pp.12-25 / 2012
  • For real-time systems, it is important to obtain an accurate worst-case execution time (WCET). Improving the WCET of applications that run on multicore processors is both significant and challenging, because the WCET can be strongly affected by inter-core interference in shared resources such as the shared L2 cache. To address this problem, we propose an approach that uses code positioning to reduce inter-core L2 cache interference between the different real-time threads that run on a multi-core processor, adaptively applying different strategies. The worst-case-oriented strategy is designed to make the largest WCET among these threads as low as possible. The other two strategies aim to reduce the WCET of each thread by an almost equal percentage or an almost equal amount. Our experiments indicate that the proposed multicore-aware code positioning approaches not only improve the worst-case performance of the real-time threads but also strike good tradeoffs between efficiency and fairness for threads that run on multicore platforms.

Multi-Dimensional Record Scan with SIMD Vector Instructions (SIMD 벡터 명령어를 이용한 다차원 레코드 스캔)

  • Cho, Sung-Ryong;Han, Hwan-Soo;Lee, Sang-Won
    • Journal of KIISE:Computing Practices and Letters / v.16 no.6 / pp.732-736 / 2010
  • Processing large amounts of data is more important than ever. In particular, information queries that require multi-dimensional record scans can be implemented efficiently with SIMD instruction sets. In this article, we present a SIMD record scan technique that employs row-based scanning. Our technique differs from existing SIMD techniques for predicate processing and aggregate operations, which apply SIMD instructions to the attributes in the same column of the database and exploit the column-based record organization of in-memory database systems. Our SIMD technique, in contrast, is useful for multi-dimensional record scanning. As registers and memory grow larger, our row-based SIMD scan can have a greater impact on performance. Moreover, since our technique is orthogonal to parallelization techniques for multi-core processors, it can be applied to both uni-processors and multi-core processors without many changes to the software architecture.

TBBench: A Micro-Benchmark Suite for Intel Threading Building Blocks

  • Marowka, Ami
    • Journal of Information Processing Systems / v.8 no.2 / pp.331-346 / 2012
  • Task-based programming is becoming the method of choice for extracting the desired performance from multi-core chips. It expresses a program in terms of lightweight logical tasks rather than heavyweight threads. Intel Threading Building Blocks (TBB) is a task-based parallel programming paradigm for multi-core processors. The performance gain of this paradigm depends to a great extent on the efficiency of its parallel constructs. The overheads incurred by these constructs determine whether large-scale parallel programs can be created, especially with fine-grained parallelism. This paper presents a study of TBB parallelization overheads. For this purpose, a TBB micro-benchmark suite called TBBench has been developed. We use TBBench to evaluate the parallelization overheads of TBB on different multi-core machines and with different compilers. We report the relative overheads in detail and analyze the results.

Analysis on the Temperature of Multi-core Processors according to Placement of Functional Units and L2 Cache (코어 내부 구성요소와 L2 캐쉬의 배치 관계에 따른 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.19 no.4 / pp.1-8 / 2014
  • As the cores of a multi-core processor are integrated on a single chip, power density has increased considerably, resulting in high temperatures. For this reason, many research groups have focused on techniques to solve thermal problems. In general, mechanical cooling systems or DTM (Dynamic Thermal Management) have been used to reduce the temperature of microprocessors. However, these approaches fall short because of their high cost and performance degradation. A floorplan scheme, in contrast, requires no extra cooling cost and causes no performance degradation. In this paper, we propose diverse floorplan schemes to alleviate the thermal problem caused by the hottest unit in multi-core processors. Simulation results show that the peak temperature can be reduced efficiently when the hottest unit is located near the L2 cache. Compared to the baseline floorplan, the peak temperatures of the core-central and core-edge floorplans decrease by $8.04^{\circ}C$ and $8.05^{\circ}C$ on average, respectively.

An Optimal Instruction Fetch Strategy for SMT Processors (SMT 프로세서에 최적화된 명령어 페치 전략에 관한 연구)

  • 홍인표;문병인;김문경;이용석
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.5C / pp.512-521 / 2002
  • Conventional superscalar RISC processors have recently reached their performance limits, and much research on next-generation architectures is concentrated on SMT (Simultaneous Multi-Threading). In SMT processors, multiple threads execute simultaneously and share hardware resources dynamically. It is therefore more important than ever to supply instructions from multiple threads to the processor core efficiently. Because the SMT architecture achieves a higher IPC (instructions per cycle) than the superscalar architecture, its performance is influenced by the fetch bandwidth and the size of the fetch queue. Moreover, to exploit TLP (Thread-Level Parallelism) efficiently, the fetch thread selection algorithm and the fetch bandwidth for each selected thread must be carefully designed. In this paper, we analyze how these factors influence performance and, based on the results, propose an optimal instruction fetch strategy for SMT processors.

A Study of Trace-driven Simulation for Multi-core Processor Architectures (멀티코어 프로세서의 명령어 자취형 모의실험에 대한 연구)

  • Lee, Jong-Bok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.12 no.3 / pp.9-13 / 2012
  • To overcome the complexity and power problems of superscalar processors, the multi-core architecture has recently become prevalent. Although execution-driven simulation is widespread, trace-driven simulation is faster. We present a methodology for simulating multi-core architectures with a trace-driven simulator. Using SPEC 2000 benchmarks as input, trace-driven simulation was performed extensively for 2 to 16 cores. On average, the 16-core processor achieved 4.1 IPC and a 13.3-times speed-up over a single-core processor.
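
The averages reported in the abstract above can be turned into per-core figures with simple arithmetic. The sketch below assumes that parallel efficiency is speed-up divided by core count and that per-core IPC is aggregate IPC divided by core count. The abstract reports neither derived quantity, and because both inputs are averages over benchmarks, the results are only indicative. The function names are hypothetical.

```python
def parallel_efficiency(speedup: float, n_cores: int) -> float:
    """Fraction of ideal linear scaling achieved (assumed definition)."""
    if n_cores <= 0:
        raise ValueError("core count must be positive")
    return speedup / n_cores


def per_core_ipc(total_ipc: float, n_cores: int) -> float:
    """Aggregate IPC spread evenly over the cores (assumed definition)."""
    if n_cores <= 0:
        raise ValueError("core count must be positive")
    return total_ipc / n_cores


# Figures quoted in the abstract: 16 cores, 4.1 IPC, 13.3x speed-up.
eff_16 = parallel_efficiency(13.3, 16)
ipc_16 = per_core_ipc(4.1, 16)
print(f"parallel efficiency: {eff_16:.2f}")  # prints 0.83
print(f"IPC per core: {ipc_16:.2f}")         # prints 0.26
```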

Analysis on the Thermal Efficiency of Branch Prediction Techniques in 3D Multicore Processors (3차원 구조 멀티코어 프로세서의 분기 예측 기법에 관한 온도 효율성 분석)

  • Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • The KIPS Transactions:PartA / v.19A no.2 / pp.77-84 / 2012
  • Speculative execution is widely used in high-performance processors to improve instruction-level parallelism. The most important factor in speculative execution is the accuracy of the branch predictor. Unfortunately, the complex branch predictors used to improve accuracy can cause serious thermal problems in 3D multi-core processors, and thermal problems degrade processor performance. This paper analyzes two methods for solving the thermal problems of the branch predictor in 3D multi-core processors. The first method is dynamic thermal management, which turns off the branch predictor when its temperature exceeds a threshold. The second method is a thermal-aware branch predictor placement policy that considers the temperature of each layer in a 3D multi-core processor. In our evaluation, the branch predictor placement policy yields an average temperature of $87.69^{\circ}C$ and an average maximum temperature gradient of $11.17^{\circ}C$, whereas dynamic thermal management yields an average temperature of $89.64^{\circ}C$ and an average maximum temperature gradient of $17.62^{\circ}C$. The proposed placement policy thus has better thermal efficiency than dynamic thermal management. In terms of performance, the proposed placement policy degrades performance by 3.61%, while dynamic thermal management degrades it by 27.66%.
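
The two methods in the abstract above can be compared side by side with a small calculation. This sketch only subtracts the reported averages; the differences themselves are not stated in the abstract, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermalResult:
    """Averages quoted in the abstract for one method."""
    avg_temp_c: float      # average temperature, degrees Celsius
    max_gradient_c: float  # average maximum temperature gradient, degrees Celsius
    perf_loss_pct: float   # performance degradation, percent


placement = ThermalResult(avg_temp_c=87.69, max_gradient_c=11.17, perf_loss_pct=3.61)
dtm = ThermalResult(avg_temp_c=89.64, max_gradient_c=17.62, perf_loss_pct=27.66)

# How much lower the placement policy is than dynamic thermal management.
temp_delta = dtm.avg_temp_c - placement.avg_temp_c
gradient_delta = dtm.max_gradient_c - placement.max_gradient_c
perf_delta = dtm.perf_loss_pct - placement.perf_loss_pct

print(f"average temperature: {temp_delta:.2f} C lower")       # prints 1.95
print(f"maximum gradient: {gradient_delta:.2f} C lower")      # prints 6.45
print(f"performance loss: {perf_delta:.2f} points lower")     # prints 24.05
```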