• Title/Summary/Keyword: Multi-processor

An Implementation of 3D Graphic Accelerator for Phong Shading (퐁 음영법을 위한 3차원 그래픽 가속기의 구현)

  • Lee, Hyung;Park, Youn-Ok;Park, Jong-Won
    • Journal of Korea Multimedia Society / v.3 no.5 / pp.526-534 / 2000
  • There has been much research on high-speed 3D graphic accelerators, driven by the needs of CAD/CAM, 3D modeling, virtual reality, and medical imaging. In this paper, an SIMD processor architecture for a 3D graphic accelerator is proposed to reduce the processing time of 3D graphics, and a parallel Phong shading algorithm is presented to estimate the performance of the proposed architecture. The architecture consists of a PCI local bus interface, 16 Processing Elements (PEs), and Park's multi-access memory system (MAMS), which has 17 memory modules. A serial algorithm for Phong shading is modified for the architecture; the key idea is to divide a polygon into $4\times 4$ squares. To process a square, 4 PEs are logically regarded as a PE Group. Since MAMS supports block access with interval 1, 4 PE Groups can process a square at a time, so 16 pixels are processed simultaneously. The proposed architecture is simulated with CADENCE Verilog-XL, a hardware simulation package. The parallel algorithm produces the same results as the serial algorithm and achieves a speedup of 5.68 over it.
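The 4x4 subdivision described above can be illustrated with a minimal sketch. It assumes the polygon is covered by tiling its bounding box, which the abstract does not state; the function name `tiles_4x4` and its parameters are hypothetical, and pixels that fall outside the polygon would still need to be masked out.

```python
def tiles_4x4(x_min, y_min, x_max, y_max, size=4):
    """Yield, for each size x size square covering the inclusive bounding box,
    the list of its pixel coordinates (16 pixels per square when size is 4)."""
    for ty in range(y_min, y_max + 1, size):
        for tx in range(x_min, x_max + 1, size):
            # Row-major order inside the square: 16 independent pixels that
            # 16 processing elements could shade at the same time.
            yield [(tx + dx, ty + dy) for dy in range(size) for dx in range(size)]


if __name__ == "__main__":
    for square in tiles_4x4(0, 0, 7, 3):
        print(square[0], "...", square[-1])
```

Each yielded list corresponds to one square whose 16 pixels have no dependencies on each other, which is what allows them to be shaded simultaneously.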

An FPGA Implementation of the Synthesis Filter for MPEG-1 Audio Layer III by a Distributed Arithmetic Lookup Table (분산산술연산방식을 이용한 MPEG-1 오디오 계층 3 합성필터의 FPGA 구현)

  • Koh Sung-Shik;Choi Hyun-Yong;Kim Jong-Bin;Ku Dae-Sung
    • The Journal of the Acoustical Society of Korea / v.23 no.8 / pp.554-561 / 2004
  • As semiconductor and multimedia communication technologies have improved, high-quality video and multi-channel audio have gained attention. MPEG Audio Layer 3 decoders have been implemented as processors based on the standard. Since the synthesis filter of the MPEG-1 Audio Layer 3 decoder accounts for the largest share of the computation in the entire decoder, a synthesis filter that reduces the amount of computation is needed to design a high-speed processor. Therefore, in this paper, the synthesis filter, the most important part of MPEG Audio, is implemented in an FPGA using the DAULT (distributed arithmetic look-up table) method. For the design of a high-speed synthesis filter, the DAULT method is used instead of a multiplier, together with a pipeline structure. A further 30% performance improvement is obtained by also storing the products of the data and the cosine function in the table. All hardware designs in this paper are described in VHDL (VHSIC Hardware Description Language). Active-HDL 6.1 of ALDEC is used for VHDL simulation, and Synplify Pro 7.2V is used for Model-sim and synthesis. The target devices are the XC4013E, XC4020EX, and XC4052XL of XILINX, and XACT M1.4 is used as the P&R tool. The implemented processor operates from 20MHz to 70MHz.
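As a rough illustration of how distributed arithmetic replaces a multiplier with table lookups, the sketch below computes an inner product with fixed integer coefficients using only lookups, shifts, and additions. It is the generic textbook technique, not the design from the paper; it assumes unsigned input samples, and the names `build_da_table` and `da_inner_product` are hypothetical.

```python
def build_da_table(coeffs):
    """Precompute the sum of every subset of coefficients.
    Entry `pattern` holds the sum of coeffs[k] for each bit k set in pattern."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (pattern >> k) & 1)
        for pattern in range(1 << n)
    ]


def da_inner_product(table, samples, bits):
    """Compute sum(coeffs[k] * samples[k]) for unsigned `bits`-bit samples
    without any multiplication: one table lookup and one shift-add per bit."""
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a table address.
        pattern = 0
        for k, x in enumerate(samples):
            pattern |= ((x >> b) & 1) << k
        # The table entry is the partial sum for this bit plane; weight it by 2**b.
        acc += table[pattern] << b
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]
    samples = [5, 9, 0, 15]
    table = build_da_table(coeffs)
    print(da_inner_product(table, samples, bits=4))
```

The table grows as 2^N for N coefficients, which is why practical designs usually split a long filter into several smaller tables.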

Thermal Pattern Comparison between 2D Multicore Processors and 3D Multicore Processors (2차원 구조와 3차원 구조에 따른 멀티코어 프로세서의 온도 분석)

  • Choi, Hong-Jun;Ahn, Jin-Woo;Jang, Hyung-Beom;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information / v.16 no.9 / pp.1-10 / 2011
  • In current microprocessors, increasing the frequency improves performance but also increases power consumption and reduces reliability. To overcome the power and thermal problems of single-core processors, multicore processors have been widely used. For 2D multicore processors, interconnection is regarded as one of the major constraints on performance and power efficiency. To reduce the performance degradation and power consumption of 2D multicore processors, 3D integrated design techniques have been studied by many researchers. Compared to 2D multicore processors, 3D multicore processors improve performance and reduce power consumption by significantly shortening the wire length. However, 3D multicore processors have serious thermal problems due to high power density, resulting in reliability degradation. Detailed thermal analysis of multicore processors can be useful in designing thermal-aware processors. In this paper, we analyze the impact of workload distribution, distance to the heat sink, and number of stacked dies on processor temperature. We also analyze the effects of temperature on overall system performance. In particular, this paper presents guidelines for thermal-aware multicore processor design by analyzing the thermal problems in 2D and 3D multicore processors.

Airborne Pulsed Doppler Radar Development (비행체 탑재 펄스 도플러 레이다 시험모델 개발)

  • Kwag, Young-Kil;Choi, Min-Su;Bae, Jae-Hoon;Jeon, In-Pyung;Yang, Ju-Yoel
    • Journal of Advanced Navigation Technology / v.10 no.2 / pp.173-180 / 2006
  • An airborne radar is an essential avionics system that enables an aircraft to perform various missions in all weather environments. This paper presents the design, development, and test results of a multi-mode pulsed Doppler radar test model for helicopter-borne flight tests. The radar system consists of 4 LRUs: ANTU (Antenna Unit), TRU (Tx Rx Unit), RSDU (Radar Signal & Data Processing Unit), and DISU (Display Unit). The developed technologies include the TACCAR processor, planar array antenna, TWTA transmitter, coherent I/Q detector, digital pulse compression, DSP-based Doppler FFT filtering, adaptive CFAR, IMU, and tracking capability. The design performance of the developed radar system is verified through various helicopter-borne field tests, including the MTD (Moving Target Detector) capability with Doppler compensation for the motion of the moving platform.

A 2-Dimension Torus-based Genetic Algorithm for Multi-disk Data Allocation (2차원 토러스 기반 다중 디스크 데이터 배치 병렬 유전자 알고리즘)

  • 안대영;이상화;송해상
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.9-22 / 2004
  • This paper presents a parallel genetic algorithm for the multi-disk data allocation problem, an NP-complete problem. The problem is to find a method of distributing a Binary Cartesian Product File over disk arrays so as to maximize parallel disk I/O accesses. A Sequential Genetic Algorithm (SGA), DAGA, has been proposed and shown to be superior to the other proposed methods, but DAGA has been observed to require considerably long simulation times. In this paper, a parallel version of DAGA (ParaDAGA) is proposed. ParaDAGA is a 2-dimensional torus-based Parallel Genetic Algorithm (PGA) built on a distributed population structure. ParaDAGA has been implemented on a parallel computer simulated on a single-processor platform. Through simulation, we study the impact of varying the ParaDAGA parameters and compare the quality of the solutions derived by ParaDAGA and DAGA. ParaDAGA produces better solutions than DAGA in all configurations, with less simulation time.
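To make the 2-dimensional torus structure concrete, the sketch below computes the four neighbours of a subpopulation on a rows x cols torus, where the edges wrap around. The abstract does not describe ParaDAGA's migration policy or node numbering, so the row-major numbering and the function name `torus_neighbors` are assumptions made for illustration.

```python
def torus_neighbors(node, rows, cols):
    """Return the north, south, west, and east neighbours of `node` on a
    rows x cols torus, with nodes numbered in row-major order (an assumption)."""
    r, c = divmod(node, cols)
    return [
        ((r - 1) % rows) * cols + c,  # north, wrapping from the top row to the bottom
        ((r + 1) % rows) * cols + c,  # south, wrapping from the bottom row to the top
        r * cols + (c - 1) % cols,    # west, wrapping from the first column to the last
        r * cols + (c + 1) % cols,    # east, wrapping from the last column to the first
    ]


if __name__ == "__main__":
    for node in range(12):
        print(node, torus_neighbors(node, rows=3, cols=4))
```

In a distributed-population genetic algorithm, each node would typically exchange individuals only with these neighbours, so every subpopulation has the same degree regardless of its position.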

Implementation of a Real-time Multipath Fading Channel Simulator Using a Hybrid DSP-FPGA Architecture (DSP-FPGA 구조를 갖는 다중경로 페이딩 채널 시뮬레이터 구현)

  • 이주현;이찬길
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.17-23 / 2004
  • The mobile radio channel can be simulated as a complex-valued random process with a narrow-band spectrum. This paper describes a real-time implementation of that process using a TMS320C6414 digital signal processor and an XC2VP30 Virtex FPGA. The simulator presented here comprehensively models not only flat fading but also frequency-selective fading mobile channel conditions. To replicate the statistical characteristics of the multipath fading environment with minimum computational burden, multi-rate techniques are employed to resolve practical problems such as variable sampling rates. Because the implementation is digital, the simulator produces accurate and consistent results. With a graphical user interface, it is flexible and simple to program for various field conditions in mobile communications.

Design of Parallel Processing of Lane Detection System Based on Multi-core Processor (멀티코어를 이용한 차선 검출 병렬화 시스템 설계)

  • Lee, Hyo-Chan;Moon, Dai-Tchul;Park, In-hag;Heo, Kang
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.9 / pp.1778-1784 / 2016
  • We improved performance by parallelizing lane detection algorithms. Lane detection, as an intelligent driver-assistance system, sounds an alarm or corrects the steering in response to lane departure. Four algorithms are implemented in the following order: a Gaussian filtering algorithm to remove interference, a gray conversion algorithm to simplify images, a Sobel edge detection algorithm to find the lane regions, and a Hough transform algorithm to detect straight lines. Among the parallelization methods, data-level parallelism is easy to design but still suffers from a bottleneck. A high-speed data-level parallelism scheme is proposed to reduce this bottleneck, resulting in a noticeable performance improvement. When the parallel algorithm is applied to actual black-box road video, the single-core version runs at approximately 30 frames/sec. With octa-core parallelism, the data-level version reaches approximately 100 frames/sec, and the highest performance comes close to 150 frames/sec.
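One common way to apply data-level parallelism to image filters such as these is to split each frame into horizontal bands, one per core. The sketch below shows only that partitioning step and is not taken from the paper: the function name `split_rows` and the `halo` parameter are hypothetical, and the one-row halo assumes a 3x3 neighbourhood filter (such as a 3x3 Sobel kernel) that needs the rows adjacent to each band.

```python
def split_rows(height, workers, halo=1):
    """Split `height` image rows into `workers` nearly equal bands.
    Each band is (read_start, read_stop, write_start, write_stop), half-open:
    a worker reads the rows including the halo but writes only its own rows."""
    bands = []
    base, extra = divmod(height, workers)
    start = 0
    for w in range(workers):
        # The first `extra` bands get one additional row each.
        stop = start + base + (1 if w < extra else 0)
        bands.append((max(0, start - halo), min(height, stop + halo), start, stop))
        start = stop
    return bands


if __name__ == "__main__":
    for band in split_rows(height=480, workers=8):
        print(band)
```

Because the written row ranges do not overlap, the workers need no synchronization within a frame; the shared frame buffer and the final merge are typical places for the kind of bottleneck the abstract mentions.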

Optimizing Skyline Query Processing Algorithms on CUDA Framework (CUDA 프레임워크 상에서 스카이라인 질의처리 알고리즘 최적화)

  • Min, Jun;Han, Hwan-Soo;Lee, Sang-Won
    • Journal of KIISE:Databases / v.37 no.5 / pp.275-284 / 2010
  • GPUs are multi-core stream processors that can process large amounts of data at high speed with a large memory bandwidth. Furthermore, GPUs are less expensive than multi-core CPUs. Recently, the use of GPUs in general-purpose computing has become widespread. The CUDA architecture from Nvidia is one of the efforts to help developers use GPUs in their application domains. In this paper, we propose techniques to parallelize a skyline algorithm that uses a simple nested-loop structure. To employ the CUDA programming model, we apply our optimization techniques so that the skyline algorithm fits within the performance restrictions of the CUDA architecture. According to our experimental results, our optimization techniques improve the original skyline algorithm by 80%.
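For readers unfamiliar with skyline queries, the following is a minimal sequential sketch of a nested-loop skyline computation. It is a generic illustration in Python rather than the paper's CUDA code, and it assumes that smaller values are better in every dimension; the function names are hypothetical.

```python
def dominates(p, q):
    """p dominates q if p is no worse than q in every dimension and strictly
    better in at least one (assuming smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Nested-loop skyline: keep each point that no other point dominates."""
    result = []
    for i, p in enumerate(points):
        # The check for point i depends only on the read-only input, so each
        # outer iteration could be assigned to its own thread.
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            result.append(p)
    return result


if __name__ == "__main__":
    print(skyline([(1, 9), (2, 4), (3, 5), (5, 1), (4, 4)]))
```

The independence of the outer-loop iterations is what makes this structure a natural candidate for one-thread-per-point parallelization on a GPU.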

Implementation of Active Noise Curtains for Long Distance Noise (원거리 소음 제거를 위한 능동방음막 구현)

  • Nam, Hyun-Do;Kwon Hyuk
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.18 no.1 / pp.154-160 / 2004
  • In this paper, an implementation of active noise curtains using multiple-channel adaptive filters is presented. A multi-channel LMS algorithm is usually used in active noise control systems, but it has much higher computational complexity. To reduce the computational burden of the adaptive filter algorithms, as many single-channel LMS algorithms as there are control loudspeakers are used instead of a multi-channel LMS algorithm. The single-channel control techniques require less DSP computation than the multiple-channel control techniques. A stabilizing procedure for adaptive IIR filters is also proposed to improve the stability of recursive LMS algorithms. Experimental results for the two control techniques on a TMS320VC33 digital signal processor show similar noise reduction, but the single-channel control techniques are more efficient in practical active noise curtain applications.
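The single-channel LMS update that the abstract builds on can be sketched as follows. This is the basic textbook FIR LMS filter, not the paper's controller: it omits the secondary-path modelling used in practical active noise control and the adaptive IIR stabilizing procedure, and the function name and parameters are hypothetical.

```python
def lms_filter(reference, desired, num_taps, mu):
    """Adapt an FIR filter so that its output tracks `desired`.
    Returns the final weights and the error signal at every sample."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference sample first
    errors = []
    for x, d in zip(reference, desired):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # filter output
        e = d - y                                         # residual error
        # LMS update: move each weight along the instantaneous gradient estimate.
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return weights, errors
```

Running one such filter per control loudspeaker costs on the order of `num_taps` multiply-adds per loudspeaker per sample, whereas a multi-channel algorithm couples every loudspeaker to every error sensor, which is the source of the extra computation mentioned above.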

Spaceborne Data Link Design for High Rate Radar Imaging Data Transmission (고속 레이다 영상자료 전송을 위한 위성탑재 데이터 링크 설계)

  • Gwak, Yeong-Gil
    • Journal of the Institute of Electronics Engineers of Korea TC / v.39 no.3 / pp.117-124 / 2002
  • A high-speed data link capability is one of the critical factors determining the performance of a high-resolution spaceborne SAR system, because of the strict requirement to transmit massive SAR data in real time within a limited mission time. In this paper, based on a data link model characterized for a small spaceborne SAR system, a high-rate multi-channel data link module is designed, including link storage, link processor, transmitter, and wide-angle antenna. The design results are presented with a performance analysis of the data link budget as well as the multi-mode data rate associated with the SAR imaging mode of operation, from high resolution to wide swath. The designed data link module can be used effectively in spaceborne and airborne applications that require an expanded high-speed data link capability.