• Title/Summary/Keyword: Processor Core

Improving Haskell GC-Tuning Time Using Divide-and-Conquer (분할 정복법을 이용한 Haskell GC 조정 시간 개선)

  • An, Hyungjun;Kim, Hwamok;Liu, Xiao;Kim, Yeoneo;Byun, Sugwoo;Woo, Gyun
    • KIPS Transactions on Computer and Communication Systems / v.6 no.9 / pp.377-384 / 2017
  • The performance improvement of single-core processors has reached its limit because circuit density can no longer be increased due to overheating. Multicore and manycore architectures have therefore emerged as viable approaches, and parallel programming has become more important. Haskell, a purely functional language, is gaining popularity in this situation because it naturally supports parallel programming through features such as implicit parallelism in expression evaluation and monadic tools for parallel constructs. However, the performance of parallel Haskell programs is strongly influenced by the performance of the run-time system, including the garbage collector. Although a memory profiling tool named GC-tune has been proposed, a more systematic way of using it is needed. Because GC-tune finds the optimal memory size by executing the target program with every possible combination of GC options, GC tuning takes too long. This paper proposes a basic divide-and-conquer method that reduces the number of GC-tune executions by reducing the search area by one-quarter at every search step. When the method is applied to two parallel programs, a maximal independent set program and a K-means program, the memory tuning time is reduced by a factor of 7.78 with 98% accuracy on average.
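
The divide-and-conquer idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes one reading of the abstract, in which each step keeps one quadrant of a two-dimensional grid of GC settings (for example GHC's `-A` and `-H` RTS sizes, an assumption since the abstract does not name the options). The names `quadrant_search`, `split`, and `synthetic_cost` are hypothetical, and the cost function is a synthetic stand-in for actually running and timing the target program.

```python
from typing import Callable, List, Tuple

# Inclusive index ranges over a 2-D grid of GC settings: (a_lo, a_hi, h_lo, h_hi).
Rect = Tuple[int, int, int, int]


def split(lo: int, hi: int) -> List[Tuple[int, int]]:
    """Split an inclusive range into two halves; a single point stays as it is."""
    if lo == hi:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return [(lo, mid), (mid + 1, hi)]


def quadrant_search(cost: Callable[[int, int], float],
                    rect: Rect) -> Tuple[Tuple[int, int], int]:
    """Probe the centre of each quadrant and keep the cheapest quadrant.

    Returns the selected grid point and the number of cost evaluations.
    """
    a_lo, a_hi, h_lo, h_hi = rect
    evaluations = 0
    while a_lo < a_hi or h_lo < h_hi:
        best = None
        for qa_lo, qa_hi in split(a_lo, a_hi):
            for qh_lo, qh_hi in split(h_lo, h_hi):
                probe = cost((qa_lo + qa_hi) // 2, (qh_lo + qh_hi) // 2)
                evaluations += 1
                if best is None or probe < best[0]:
                    best = (probe, (qa_lo, qa_hi, qh_lo, qh_hi))
        a_lo, a_hi, h_lo, h_hi = best[1]
    return (a_lo, h_lo), evaluations


def synthetic_cost(a: int, h: int) -> float:
    """Hypothetical run time with a single minimum at grid point (5, 11)."""
    return (a - 5) ** 2 + (h - 11) ** 2


best_point, n_runs = quadrant_search(synthetic_cost, (0, 15, 0, 15))
print(best_point, n_runs)
```

On this 16 x 16 grid the sketch evaluates 16 settings instead of the 256 an exhaustive sweep would need. Because only quadrant centres are probed, a cost surface with several local minima can lead the search to a non-optimal setting, which is consistent with the abstract reporting accuracy below 100%.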

Design and Implementation of Efficient Decoder for Fractal-based Compressed Image (효율적 프랙탈 영상 압축 복호기의 설계 및 구현)

  • Kim, Chun-Ho;Kim, Lee-Sup
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.12 / pp.11-19 / 1999
  • Fractal image compression algorithms have been studied mostly from a software rather than a hardware perspective. However, software running on a general-purpose processor cannot decode fractal-compressed images in real time, so fast dedicated hardware is needed, yet design examples of such hardware are very rare. In this paper, we designed a quadtree fractal-based compressed image decoder that can decode $256{\times}256$ gray-scale images in real time, and we applied two power-reduction methods. The first is a hardware-optimized simple post-processing step that removes the block effect appearing after reconstruction. It is easier to implement in hardware than the weighted-average method with non-power-of-2 weights used in conventional software implementations, lowers cost, and speeds up post-processing by about 69%; we therefore expect it to reduce power and energy dissipation. The second is a multiplier design whose power dissipation is about 28% lower than that of a general array multiplier, which is known to be efficient for low-power design at sizes of 8 bits or smaller. Using these two power-reduction methods, we designed the decoder's core block in a 3.3 V, 1-poly 3-metal, $0.6{\mu}m$ CMOS technology.

Energy-Efficient Signal Processing Using FPGAs (FPGA 상에서 에너지 효율이 높은 병렬 신호처리 기법)

  • Jang Ju-wook;Hwang Yunil;Scrofano Ronald;Prasanna Viktor K.
    • The KIPS Transactions:PartA / v.12A no.4 s.94 / pp.305-312 / 2005
  • In this paper, we present algorithm-level techniques for energy-efficient design using FPGAs. We then use these techniques to create energy-efficient designs for two signal processing kernels: the fast Fourier transform (FFT) and matrix multiplication. We evaluate the performance of FPGAs on these tasks in terms of both latency and energy efficiency. Using a Xilinx Virtex-II as the target FPGA, we compare our designs with those from the Xilinx library and with conventional algorithms run on the PowerPC core embedded in the Virtex-II Pro and on the Texas Instruments TMS320C6415. Our evaluations are carried out both through high-level estimation based on energy and latency equations and through low-level simulation. For FFT, our designs dissipated on average $50\%$ less energy than the design from the Xilinx library and $56\%$ less than the DSP. Our designs also showed a 10-fold improvement in the EAT factor over the embedded processor. These results provide concrete evidence that FPGAs can outperform DSPs and embedded processors in signal processing. Further, they show that FPGAs can achieve this performance while dissipating less energy than the other two types of devices.

Adaptive Filter Design for Eliminating Baseline Wandering Noise of Electrocardiogram (심전도 기저선 흔들림 잡음 제거를 위한 적응형 필터 설계)

  • Choi, Chul-Hyung;Rahman, MD Saifur;Kim, Si-Kyung;Park, In-Deok;Kim, Young-Pil
    • The Journal of Korean Institute of Information Technology / v.15 no.12 / pp.157-164 / 2017
  • Mobile ECG measurement captures small signals of several mV, and many studies have addressed noise removal, including baseline wandering. Removing the equipotential line noise caused by shaking or movement of the electrode cable is one of the core research topics in electrocardiogram measurement. In this study, we propose a modified step-size for a combined NLMS (normalized least mean squares) and DLMS (delayed least mean squares) adaptive filter to eliminate baseline noise from ECG signals. The proposed method mainly adjusts the initial filter step-size to reduce distortion of the original ECG signal characteristics after the baseline noise is eliminated. The modified filter step-size is scaled by the filter order and a distortion minimization factor. This method is suitable for portable ECG devices with a small processor and low power consumption. The technique also decreases computation time, which is essential for real-time filtering. The proposed filter also increases the signal-to-noise ratio (SNR) compared with a conventional NLMS filter.
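
For orientation, the conventional NLMS filter that the abstract above uses as its baseline can be sketched as follows. This is a textbook NLMS noise canceller with a fixed step size, not the authors' modified NLMS/DLMS scheme; the constant reference input, the sampling rate, the step size, and the sinusoidal stand-in for an ECG are all assumptions made for illustration, and `nlms_cancel` is a hypothetical name.

```python
import math


def nlms_cancel(primary, reference, order=1, mu=0.05, eps=1e-8):
    """Textbook NLMS noise canceller.

    The filter output estimates the noise in `primary` from `reference`;
    the error signal, returned here, is the cleaned signal.
    """
    weights = [0.0] * order
    taps = [0.0] * order
    cleaned = []
    for d, x in zip(primary, reference):
        taps = [x] + taps[:-1]                      # shift in the newest sample
        y = sum(w * t for w, t in zip(weights, taps))
        e = d - y                                   # error = cleaned sample
        norm = eps + sum(t * t for t in taps)       # input-power normalisation
        weights = [w + mu * e * t / norm for w, t in zip(weights, taps)]
        cleaned.append(e)
    return cleaned


fs = 250                      # assumed sampling rate in Hz
n_samples = 10 * fs
# Stand-in for the ECG: a small 8 Hz sinusoid (a real ECG is not sinusoidal).
ecg_like = [0.1 * math.sin(2 * math.pi * 8 * n / fs) for n in range(n_samples)]
# Baseline wander: a DC offset plus a slow 0.2 Hz drift.
baseline = [0.5 + 0.3 * math.sin(2 * math.pi * 0.2 * n / fs)
            for n in range(n_samples)]
primary = [s + b for s, b in zip(ecg_like, baseline)]
# A constant reference lets the single weight track the slowly moving baseline.
cleaned = nlms_cancel(primary, [1.0] * n_samples, order=1, mu=0.05)
```

With a constant reference, the single weight behaves like a running estimate of the baseline, so the step size `mu` sets the trade-off the abstract describes: a larger value tracks the baseline faster but removes more of the low-frequency content of the signal itself.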

A ScanSAR Processing without Azimuth Stitching by Time-domain Cross-correlation (Azimuth Stitching 없는 ScanSAR 영상화: 시간영역 교차상관)

  • Won, Joong-Sun
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.251-263 / 2022
  • This paper presents an approach to ScanSAR image formation. Because ScanSAR acquires raw signals in burst mode, most conventional single-burst methods require an azimuth stitching step, which introduces some radiometric and phase distortion. Time-domain cross-correlation can replace SPECAN, the most widely used method for ScanSAR processing. The core idea of the proposed method is that azimuth stitching becomes unnecessary when the Doppler bandwidth of the reference function is extended to the burst cycle period. The method was evaluated on raw signals acquired by a spaceborne SAR system, and the results satisfied all image quality requirements, including 3 dB width, peak-to-sidelobe ratio (PSLR), compression ratio, speckle noise, and others. ScanSAR image quality is inferior to that of Stripmap in all aspects. However, ScanSAR quality can be made competitive with Stripmap for a chosen parameter at the cost of reduced quality in other parameters. A ScanSAR processor therefore needs to offer a high degree of flexibility to meet the differing requirements of different applications and techniques.
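
The basic operation the abstract above relies on, time-domain cross-correlation of a received signal against a reference function, can be sketched generically. This is not the proposed ScanSAR processor: it omits burst-mode acquisition and the extended Doppler bandwidth, and the chirp rate, lengths, and delay are hypothetical values chosen only to show how correlation compresses a chirp into a peak at the echo position.

```python
import cmath


def chirp(n_samples, rate):
    """Complex linear-FM reference function, exp(j * pi * rate * n^2)."""
    return [cmath.exp(1j * cmath.pi * rate * n * n) for n in range(n_samples)]


def cross_correlate(signal, reference):
    """Time-domain cross-correlation of `signal` with `reference`.

    Output sample `lag` is sum_n signal[lag + n] * conj(reference[n]).
    """
    n_ref = len(reference)
    conj_ref = [r.conjugate() for r in reference]
    return [
        sum(signal[lag + n] * conj_ref[n] for n in range(n_ref))
        for lag in range(len(signal) - n_ref + 1)
    ]


reference = chirp(64, 0.005)          # hypothetical reference function
delay = 37                            # hypothetical echo position in samples
echo = [0j] * delay + reference + [0j] * (200 - delay - len(reference))

compressed = cross_correlate(echo, reference)
peak_lag = max(range(len(compressed)), key=lambda k: abs(compressed[k]))
print(peak_lag)
```

The correlation magnitude peaks where the echo aligns with the reference, with a peak value equal to the reference length. Direct correlation costs on the order of the signal length times the reference length, which is the usual price paid for its flexibility compared with FFT-based approaches.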

A Security SoC embedded with ECDSA Hardware Accelerator (ECDSA 하드웨어 가속기가 내장된 보안 SoC)

  • Jeong, Young-Su;Kim, Min-Ju;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.7 / pp.1071-1077 / 2022
  • A security SoC that can be used to implement elliptic curve cryptography (ECC) based public-key infrastructures was designed. In the security SoC, a hardware accelerator for the elliptic curve digital signature algorithm (ECDSA) is interfaced with the Cortex-A53 CPU over the AXI4-Lite bus. The ECDSA hardware accelerator, which consists of a high-performance ECC processor, a SHA3 hash core, a true random number generator (TRNG), a modular multiplier, BRAM, and a control FSM, was designed to perform ECDSA signature generation and signature verification at high performance with minimal CPU control. The security SoC was implemented in a Zynq UltraScale+ MPSoC device for hardware-software co-verification, and the evaluation showed that ECDSA signature generation or signature verification can be performed about 1,000 times per second at a clock frequency of 150 MHz. The ECDSA hardware accelerator was implemented using hardware resources of 74,630 LUTs, 23,356 flip-flops, 32kb BRAM, and 36 DSP blocks.

Development and Field Test of the NEXTSat-2 Synthetic Aperture Radar (SAR) Antenna Onboard Vehicle (차세대소형위성 2호 영상 레이다 안테나 개발 및 차량 탑재 시험)

  • Shin, Goo-Hwan;Lee, Jung-Su;Jang, Tae Seong;Kim, Dong-Guk;Jung, Young-Bae
    • Journal of Space Technology and Applications / v.1 no.1 / pp.33-40 / 2021
  • The NEXTSat-2 SAR (synthetic aperture radar) system was developed under the requirement of a total weight of 42 kg or less. As the NEXTSat-2 is a small-sized satellite, the SAR system was designed to account for about 40% of the dry mass of the payload relative to the total mass. Among the major components of the SAR system (an antenna, an RF transceiver, a baseband signal processor, and a power unit), the antenna, which is the core of the SAR system, has a particularly large dry mass. Although various choices are possible when designing the antenna in consideration of gain and efficiency, a microstrip patch array antenna was adopted to reflect the dry mass, power, and resolution required by the NEXTSat-2 project. To meet the mission requirements of the NEXTSat-2, the antenna was developed with a frequency of 9.65 GHz, a gain of 42.7 dBi, and a return loss of -15 dB. The performance of the antenna was verified through a field test onboard a vehicle.