• Title/Summary/Keyword: many-core processor

Search Result 54, Processing Time 0.023 seconds

Synthesis of Ocean Wave Models and Simulation Using GPU (바다물결 모형의 합성 및 GPU를 이용한 시뮬레이션)

  • Lee, Dong-Min;Lee, Sung-Kee
    • The KIPS Transactions:PartA
    • /
    • v.14A no.7
    • /
    • pp.421-434
    • /
    • 2007
  • Among many other CG generated natural scenes, the representation of ocean surfaces is one of the most complicated and time-consuming problem because of its large extent and complex surface movement. We present a hybrid method to represent and animate unbound deep-water ocean surfaces by utilizing graphics processor as both simulation and rendering core. Our technique is mainly based on spectral approaches that generate a high-detailed height field using Fourier transform on a 2D regular grid. Additionally, we incorporate Gerstner model and generate low-detailed height field on a 2D projected grid in order to represent large waves and main structure of ocean surface. There is no interruption between CPU and GPU, and no need to transfer simulation results from the system memory to graphics hardware because the entire simulation and rending processes are done on graphics processor. As a result we can synthesize and render realistic water surfaces in real-time. Proposed techniques are readily adoptable to real-time applications such as computer games that have heavy work load on CPU but still demand plausible natural scenes.

A Performance Evaluation of a RISC-Based Digital Signal Processor Architecture (RISC 기반 DSP 프로세서 아키텍쳐의 성능 평가)

  • Kang, Ji-Yang;Lee, Jong-Bok;Sung, Won-Yong
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.2
    • /
    • pp.1-13
    • /
    • 1999
  • As the complexity of DSP (Digital Signal Processing) applications increases, the need for new architectures supporting efficient high-level language compilers also grows. By combining several DSP processor specific features, such as single cycle MAC (Multiply-and-ACcumulate), direct memory access, automatic address generation, and hardware looping, with a RISC core having many general purpose registers and orthogonal instructions, a high-performance and compiler-friendly RISC-based DSP processors can be designed. In this study, we develop a code-converter that can exploit these DSP architectural features by post-processing compiler-generated assembly code, and evaluate the performance effects of each feature using seven DSP-kernel benchmarks and a QCELP vocoder program. Finally, we also compare the performances with several existing DSP processors, such as TMS320C3x, TMS320C54x, and TMS320C5x.

  • PDF

AC-DC Converter Control for Power Factor Correction of Inverter Air Conditioner System (인버터 에어컨 시스템의 역률보상을 위한 AC-DC 컨버터 제어)

  • Park, Gwi-Geun;Choi, Jae-Weon
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.13 no.2
    • /
    • pp.154-162
    • /
    • 2007
  • In this paper, we propose a new AC-DC converter control method to comply with harmonics regulation(IEC 61000-3) effective for the inverter system of an air conditioner whose power consumption is less than 2,500W. There are many different ways of AC-DC converter control, but this paper focuses on the converter control method that is adopting an input reactor with low cost silicon steel core to strengthen cost competitiveness of the manufacturer. The proposed control method controls input current every half cycle of the line frequency to get unit power factor and at the same time to reduce switching loss of devices and acoustic noise from reactor. This kind of converter is known as a Partial Switching Converter(PSC). In this study, theoretical analysis of the PSC has been performed using Matlab/Simulink while a 16-bit micro-processor based converter has been used to perform the experimental analysis. In the theoretical analysis, electrical circuit models and equations of the PSC are derived and simulated. In the experiments, micro-processor controls input current to keep the power factor above 0.95 by reducing the phase difference between input voltage and current and at the same time to maintain a reference DC-link voltage against voltage drop which depends on DC-link load. Therefore it becomes possible to comply with harmonic regulations while the power factor is maximized by optimizing the time of current flow through the input reactor for every half cycle of line frequency.

A Study on Cycle Based Simulator of a 32 bit floating point DSP (32비트 부동소수점 DSP의 Cycle Based Simulator에 관한 연구)

  • 우종식;양해용;안철홍;박주성
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.35C no.11
    • /
    • pp.31-38
    • /
    • 1998
  • This paper deals with CBS(Cycle Base Simulator) design of a 32 bit floating point DSP(Digital Signal Processor). The CBS has been developed for TMS320C30 compatible DSP and will be used to confirm the architecture, functions of sub-blocks, and control signals of the chip before the detailed logic design starts with VHDL. The outputs from CBS are used as important references at gate level design step because they give us control signals, output values of important blocks, values from internal buses and registers at each pipeline step, which are not available from the commercial simulator of DSP. In addition to core functions, it has various interfaces for efficient execution and convenient result display, CBS is verified through comparison with results from the commercial simulator for many application algorithms and its simulation speed is as fast as several tenth of that of logic simulation with VHDL. CBS in this work is for a specific DSP, but the concept may be applicable to other VLSI design.

  • PDF

Analysis on the Temperature of Multi-core Processors according to Placement of Functional Units and L2 Cache (코어 내부 구성요소와 L2 캐쉬의 배치 관계에 따른 멀티코어 프로세서의 온도 분석)

  • Son, Dong-Oh;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.4
    • /
    • pp.1-8
    • /
    • 2014
  • As cores in multi-core processors are integrated in a single chip, power density increased considerably, resulting in high temperature. For this reason, many research groups have focused on the techniques to solve thermal problems. In general, the approaches using mechanical cooling system or DTM(Dynamic Thermal Management) have been used to reduce the temperature in the microprocessors. However, existing approaches cannot solve thermal problems due to high cost and performance degradation. However, floorplan scheme does not require extra cooling cost and performance degradation. In this paper, we propose the diverse floorplan schemes in order to alleviate the thermal problem caused by the hottest unit in multi-core processors. Simulation results show that the peak temperature can be reduced efficiently when the hottest unit is located near to L2 cache. Compared to baseline floorplan, the peak temperature of core-central and core-edge are decreased by $8.04^{\circ}C$, $8.05^{\circ}C$ on average, respectively.

An Optimal Instruction Fetch Strategy for SMT Processors (SMT 프로세서에 최적화된 명령어 페치 전략에 관한 연구)

  • 홍인표;문병인;김문경;이용석
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.5C
    • /
    • pp.512-521
    • /
    • 2002
  • Recently, conventional superscalar RISC processors arrive their performance limit, and many researches on the next-generation architecture are concentrated on SMT(Simultaneous Multi-Threading). In SMT processors, multiple threads are executed simultaneously and share hardware resources dynamically. In this case, it is more important to supply instructions from multiple threads to processor core efficiently than ever. Because SMT architecture shows higher IPC(Instructions per cycle) than superscalar architecture, performance is influenced by fetch bandwidth and the size of fetch queue. Moreover, to use TLP(Thread Level Parallelism) efficiently, fetch thread selection algorithm and fetch bandwidth for each selected threads must be carefully designed. Thus, in this paper, the performance values influenced by these factors are analyzed. Based on the results, an optimal instruction fetch strategy for SMT processors is proposed.

The study on the Efficient methodology to apply the GPU for military information system improvement (국방정보시스템 성능향상을 위한 효율적인 GPU적용방안 연구)

  • Kauh, Janghyuk;Lee, Dongho
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.11 no.1
    • /
    • pp.27-35
    • /
    • 2015
  • Increasing the number of GPU (Graphic Processor Unit) cores, the studies on High Performance Computing Platform using GPU have actively been made in recent. This trend has led to the development of GPGPU (General Purpose GPU) and CUDA (Compute Unified Device Architecture) Framework. In this paper, we explain the many benefits of the GPU based system, and propose the ICIDF(Identify Compute-Intensive Data set and Function) methodology to apply GPU technology to legacy military information system for performance improvement. To demonstrate the efficiency of this methodology, we applied this method to AES CPU based program obtained from the Internet web site. Simply changing the data structure made improved the performance of AES program. As a result, the performance of AES based GPU program is improved gradually up to 10 times. Depending on the developer's ability, additional performance improvement can be expected. The problem to be solved is heat issue, but this problem has been much improved by the development of the cooling technology.

Evaluation of Alignment Methods for Genomic Analysis in HPC Environment (HPC 환경의 대용량 유전체 분석을 위한 염기서열정렬 성능평가)

  • Lim, Myungeun;Jung, Ho-Youl;Kim, Minho;Choi, Jae-Hun;Park, Soojun;Choi, Wan;Lee, Kyu-Chul
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.2
    • /
    • pp.107-112
    • /
    • 2013
  • With the progress of NGS technologies, large genome data have been exploded recently. To analyze such data effectively, the assistance of HPC technique is necessary. In this paper, we organized a genome analysis pipeline to call SNP from NGS data. To organize the pipeline efficiently under HPC environment, we analyzed the CPU utilization pattern of each pipeline steps. We found that sequence alignment is computing centric and suitable for parallelization. We also analyzed the performance of parallel open source alignment tools and found that alignment method utilizing many-core processor can improve the performance of genome analysis pipeline.

Thermal Pattern Comparison between 2D Multicore Processors and 3D Multicore Processors (2차원 구조와 3차원 구조에 따른 멀티코어 프로세서의 온도 분석)

  • Choi, Hong-Jun;Ahn, Jin-Woo;Jang, Hyung-Beom;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.9
    • /
    • pp.1-10
    • /
    • 2011
  • Unfortunately, in current microprocessors, increasing the frequency causes increased power consumption and reduced reliability whereas it improves the performance. To overcome the power and thermal problems in the singlecore processors, multicore processors has been widely used. For 2D multicore processors, interconnection is regarded as one of the major constraints in performance and power efficiency. To reduce the performance degradation and the power consumption in 2D multicore processors, 3D integrated design technique has been studied by many researchers. Compared to 2D multicore processors, 3D multicore processors get the benefits of performance improvement and reduced power consumption by reducing the wire length significantly. However, 3D multicore processors have serious thermal problems due to high power density, resulting in reliability degradation. Detailed thermal analysis for multicore processors can be useful in designing thermal-aware processors. In this paper, we analyze the impact of workload distribution, distance to the heat sink, and number of stacked dies on the processor temperature. We also analyze the effects of the temperature on overall system performance. Especially, this paper presents the guideline for thermal-aware multicore processor design by analyzing the thermal problems in 2D multicore processors and 3D multicore processors.

An Enhanced Overload Control Mechanism for the Distributed Switching System supporting Various Types of Call Services (다양한 호 서비스를 고려한 분산형 중계교환기의 과부하 제어 기법)

  • Lee Jong-Hyup
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.4
    • /
    • pp.744-751
    • /
    • 2006
  • Even many kinds of Internet-based services have been generated due to the great development of the Internet. PSTN still exists in the center of the national infrastructure network and the transit exchanges will be maintained the core roles in PSTN for many years in the future. These transit exchanges often suffer from unexpected overload situation because they have to process Intelligent Network calls and mobile calls additionally as well as POTS (Plain Old Telephone Service). In this paper, we suggest an efficient overload control algorithm for the distributed transit switching system under the various types of call services. We also show that this algorithm can be implemented easily cooperating with the network management control functions. The simulation technique is used to show that the proposed algorithm effectively controls calls and maintains safely the call processor's load under the various kinds of overload situations.