• Title/Summary/Keyword: hardware optimization

Search Result 210, Processing Time 0.032 seconds

Radix-2 16 Points FFT Algorithm Accelerator Implementation Using FPGA (FPGA를 사용한 radix-2 16 points FFT 알고리즘 가속기 구현)

  • Gyu Sup Lee;Seong-Min Cho;Seung-Hyun Seo
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.34 no.1
    • /
    • pp.11-19
    • /
    • 2024
  • The increased utilization of the FFT in signal processing, cryptography, and various other fields has highlighted the importance of optimization. In this paper, we propose the implementation of an accelerator that processes the radix-2 16 points FFT algorithm more rapidly and efficiently than FFT implementation of existing studies, using FPGA(Field Programmable Gate Array) hardware. Leveraging the hardware advantages of FPGA, such as parallel processing and pipelining, we design and implement the FFT logic in the PL (Programmable Logic) part using the Verilog language. We implement the FFT using only the Zynq processor in the PS (Processing System) part, and compare the computation times of the implementation in the PL and PS part. Additionally, we demonstrate the efficiency of our implementation in terms of computation time and resource usage, in comparison with related works.

Grid-Connected Dual Stator-Winding Induction Generator Wind Power System for Wide Wind Speed Ranges

  • Shi, Kai;Xu, Peifeng;Wan, Zengqiang;Bu, Feifei;Fang, Zhiming;Liu, Rongke;Zhao, Dean
    • Journal of Power Electronics
    • /
    • v.16 no.4
    • /
    • pp.1455-1468
    • /
    • 2016
  • This paper presents a grid-connected dual stator-winding induction generator (DWIG) wind power system suitable for wide wind speed ranges. The parallel connection via a unidirectional diode between dc buses of both stator-winding sides is employed in this DWIG system, which can output a high dc voltage over wide wind speed ranges. Grid-connected inverters (GCIs) do not require booster converters; hence, the efficiency of wind energy utilization increases, and the hardware topology and control strategy of GCIs are simplified. In view of the particularities of the parallel topology and the adopted generator control strategy, we propose a novel excitation-capacitor optimization solution to reduce the volume and weight of the static excitation controller. When this excitation-capacitor optimization is carried out, the maximum power tracking problem is also considered. All the problems are resolved with the combined control of the DWIG and GCI. Experimental results on the platform of a 37 kW/600 V prototype show that the proposed DWIG wind power system can output a constant dc voltage over wide rotor speed ranges for grid-connected operations and that the proposed excitation optimization scheme is effective.

Hyperparameter Search for Facies Classification with Bayesian Optimization (베이지안 최적화를 이용한 암상 분류 모델의 하이퍼 파라미터 탐색)

  • Choi, Yonguk;Yoon, Daeung;Choi, Junhwan;Byun, Joongmoo
    • Geophysics and Geophysical Exploration
    • /
    • v.23 no.3
    • /
    • pp.157-167
    • /
    • 2020
  • With the recent advancement of computer hardware and the contribution of open source libraries to facilitate access to artificial intelligence technology, the use of machine learning (ML) and deep learning (DL) technologies in various fields of exploration geophysics has increased. In addition, ML researchers have developed complex algorithms to improve the inference accuracy of various tasks such as image, video, voice, and natural language processing, and now they are expanding their interests into the field of automatic machine learning (AutoML). AutoML can be divided into three areas: feature engineering, architecture search, and hyperparameter search. Among them, this paper focuses on hyperparamter search with Bayesian optimization, and applies it to the problem of facies classification using seismic data and well logs. The effectiveness of the Bayesian optimization technique has been demonstrated using Vincent field data by comparing with the results of the random search technique.

Performance Reengineering of Embedded Real-Time Systems (내장형 실시간 시스템의 성능 개선을 위한 리엔지니어링 기법)

  • 홍성수
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.5_6
    • /
    • pp.299-306
    • /
    • 2003
  • This paper formulates a problem of embedded real-time system re-engineering, and presents its solution approach. Embedded system re-engineering is defined as a development task of meeting performance requirements newly imposed on a system after its hardware and software have been fully implemented. The performance requirements nay include a real-time throughput and an input-to-output latency. The proposed solution approach is based on a bottleneck analysis and nonlinear optimization. The inputs to the approach include a system design specified with a process network and a set of task graphs, task allocation and scheduling, and a new real-time throughput requirement specified as a system's period constraint. The solution approach works in two steps. In the first step, it determines bottleneck precesses in the process network via estimation of process latencies. In the second step, it derives a system of constraints with performance scaling factors of processing elements being variables. It then solves the constraints for the performance staling factors with an objective of minimizing the total hardware cost of the resultant system. These scaling factors suggest the minimal cost hardware upgrade to meet the new performance requirement. Since this approach does not modify carefully designed software structures, it helps reduce the re-engineering cycle.

A Design of Parameterized Viterbi Decoder using Hardware Sharing (하드웨어 공유를 이용한 파라미터화된 비터비 복호기 설계)

  • Park, Sang-Deok;Jeon, Heung-Woo;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.05a
    • /
    • pp.93-96
    • /
    • 2008
  • This paper describes an efficient design of a multi-standard Viterbi decoder that supports multiple constraint lengths and code rates. The Viterbi decode. is parameterized for the code rates 1/2, 1/3 and constraint lengths 7, 9, thus it has four operation modes. In order to achieve low hardware complexity and low power, an efficient architecture based on hardware sharing techniques is devised. Also, the optimization of ACCS (Accumulate-Subtract) circuit for the one-point trace-back algorithm reduces its area by about 35% compared to the full parallel ACCS circuit. The parameterized Viterbi decoder core has 79,818 gates and 25,600 bits memory, and the estimated throughput is about 105 Mbps at 70 MHz clock frequency.

  • PDF

Design of Multiplierless Lifting-based Wavelet Transform using Pattern Search Methods (패턴 탐색 기법을 사용한 Multiplierless 리프팅 기반의 웨이블릿 변환의 설계)

  • Son, Chang-Hoon;Park, Seong-Mo;Kim, Young-Min
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.7
    • /
    • pp.943-949
    • /
    • 2010
  • This paper presents some improvements on VLSI implementation of lifting-based 9/7 wavelet transform by optimization hardware multiplication. The proposed solution requires less logic area and power consumption without performance loss compared to previous wavelet filter structure based on lifting scheme. This paper proposes a better approach to the hardware implementation using Lefevre algorithm based on extensions of Pattern search methods. To compare the proposed structure to the previous solutions on full multiplier blocks, we implemented them using Verilog HDL. For a hardware implementation of the two solutions, the logical synthesis on 0.18 um standard cells technology show that area, maximum delay and power consumption of the proposed architecture can be reduced up to 51%, 43% and 30%, respectively, compared to previous solutions for a 200 MHz target clock frequency. Our evaluation show that when design VLSI chip of lifting-based 9/7 wavelet filter, our solution is better suited for standard-cell application-specific integrated circuits than prior works on complete multiplier blocks.

Design of Low-Complexity 128-Bit AES-CCM* IP for IEEE 802.15.4-Compatible WPAN Devices (IEEE 802.15.4 호환 WPAN 기기를 위한 낮은 복잡도를 갖는128-bit AES-CCM* IP 설계)

  • Choi, Injun;Lee, Jong-Yeol;Kim, Ji-Hoon
    • Journal of IKEEE
    • /
    • v.19 no.1
    • /
    • pp.45-51
    • /
    • 2015
  • Recently, as WPAN (Wireless Personal Area Network) becomes the necessary feature in IoT (Internet of Things) devices, the importance of data security also hugely increases. In this paper, we present the low-complexity 128-bit AES-$CCM^*$ hardware IP for IEEE 802.15.4 standard. For low-cost and low-power implementation which is essentially required in IoT devices, we propose two optimization methods. First, the folded AES(Advanced Encryption Standard) processing core with 8-bit datapath is presented where composite field arithmetic is adopted for reduced hardware complexity. In addition, to support $CCM^*$ mode defined in IEEE 802.15.4, we propose the mode-toggling architecture which requires less hardware resources and processing time. With the proposed methods, the gate count of the proposed AES-$CCM^*$ IP can be lowered up to 57% compared to the conventional architecture.

An HEVC intra encoder sharing DCT with RDO for a low complex hardware (하드웨어 복잡도를 줄이기 위한 RDO내 DCT 공유구조의 HEVC 화면내 예측부호화기)

  • Lee, Sukho;Jang, Juneyoung;Byun, Kyungjun;Eum, Nakwoong
    • Smart Media Journal
    • /
    • v.3 no.4
    • /
    • pp.16-21
    • /
    • 2014
  • HEVC is the latest joint video coding standard with ITU-T SG16 WP and ISO/IEC JTC1/SC29/WG11. Its coding efficiency is about two times compared to H.264 high profile. Intra prediction has 35 directional modes including dc and planer. However an accurate mode decision on lots of modes with SSE is too costly to implement it with hardware. The key idea of this paper is a DCT shared architecture to reduce the complexity of HEVC intra encoder. It is to use same DCT block to quantize as well as to calculate SSE in RDO. The proposed intra encoder uses two step mode decision to lighten complexity with simplified RDO blocks and shares the transform resources. Its BD-rate increase is negligible at 20% on hardware aspect and the operating clock frequency is 300MHz@60fps on FHD ($1920{\times}1080$) image.

A VLSI Design of High Performance H.264 CAVLC Decoder Using Pipeline Stage Optimization (파이프라인 최적화를 통한 고성능 H.264 CAVLC 복호기의 VLSI 설계)

  • Lee, Byung-Yup;Ryoo, Kwang-Ki
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.12
    • /
    • pp.50-57
    • /
    • 2009
  • This paper proposes a VLSI architecture of CAVLC hardware decoder which is a tool eliminating statistical redundancy in H.264/AVC video compression. The previous CAVLC hardware decoder used four stages to decode five code symbols. The previous CAVLC hardware architectures decreased decoding performance because there was an unnecessary idle cycle in between state transitions. Likewise, the computation of valid bit length includes an unnecessary idle cycle. This paper proposes hardware architecture to eliminate the idle cycle efficiently. Two methods are applied to the architecture. One is a method which eliminates an unnecessary things of buffers storing decoded codes and then makes efficient pipeline architecture. The other one is a shifter control to simplify operations and controls in the process of calculating valid bit length. The experimental result shows that the proposed architecture needs only 89 cycle in average for one macroblock decoding. This architecture improves the performance by about 29% than previous designs. The synthesis result shows that the design achieves the maximum operating frequency at 140Mhz and the hardware cost is about 11.5K under a 0.18um CMOS process. Comparing with the previous design, it can achieve low-power operation because this design is implemented with high throughputs and low gate count.

Feasibility Study on Robust Calibration by DoE to Minimize the Exhaust Emission Deviations from Injector Flow Rate Scatters (DoE를 이용한 인젝터 유량 편차에 의한 배출가스 편차에 대한 강건 엔진 매핑 가능성의 검토)

  • Chang, Jin-Seok;Cheong, Jae-Hoon;Jo, Chung-Hoon
    • Transactions of the Korean Society of Automotive Engineers
    • /
    • v.16 no.1
    • /
    • pp.134-143
    • /
    • 2008
  • The hardware scatters as well as the engine parameters calibration have strong influences on exhaust emissions in recent diesel engines. In this research DoE(Design of Experiments) optimizations were done to study the possibility of minimizing the emission deviations caused by flow rate scatters of the injectors. It has been shown that the optimization of engine calibration, which minimizes the emission deviations, is feasible by establishing a target function representing the emission deviations for test results of maximum, mean and minimum flow rate injectors. It has also been shown that optimization of both emission deviations and emission level is possible by sequential DoE optimizations of the target functions representing the emission level and the emission deviations respectively with the appropriate boundary limits.