Search | Korea Science

A High-Speed SIMD MAC Unit (고속 SIMD형 곱셈 누산기)

조민석;오형철
- Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.694-696 / 2004
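This paper designs a two-stage SIMD multiply-accumulate (MAC) unit for a 130 MHz pipeline that delivers the lower 32 bits of a 32$\times$32-bit multiplication within one clock cycle. As part of this work, we designed a Booth encoder that supports signed operations while reducing the delay incurred in generating the partial products. The generated partial products are summed by a Wallace tree whose size is selected according to the SIMD instruction, and every result other than the lower 32 bits of the 32$\times$32-bit multiplication is obtained in the second pipeline stage. When synthesized with Samsung 0.18 $\mu\textrm{m}$ standard cells, the current SIMD MAC design has a critical-path delay of about 7.61 ns at 1.65 V and $+125^{\circ}\mathrm{C}$.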
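As a rough illustration of how Booth recoding produces signed partial products, the following Python sketch recodes the multiplier into radix-4 Booth digits and sums the shifted partial products. The radix, the function names, and the use of a plain `sum` in place of a Wallace tree are assumptions made for illustration; the abstract does not give these details.

```python
def booth_radix4_digits(y: int, bits: int = 32) -> list[int]:
    """Recode a signed multiplier (bits must be even) into radix-4 Booth digits in -2..2."""
    y &= (1 << bits) - 1              # two's-complement bit pattern of y
    digits = []
    prev = 0                          # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int = 32) -> int:
    """Signed product formed from one partial product per Booth digit."""
    partial_products = [(d * x) << (2 * k)
                        for k, d in enumerate(booth_radix4_digits(y, bits))]
    # A hardware design would compress these with a Wallace tree;
    # a plain sum stands in for it here (assumption).
    return sum(partial_products)


def low_word(product: int, bits: int = 32) -> int:
    """Lower `bits` bits of the product, the part the abstract targets for one cycle."""
    return product & ((1 << bits) - 1)
```

Radix-4 recoding halves the number of partial products (16 instead of 32 for a 32-bit multiplier), which is why it is a common choice ahead of a compression tree.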

Implementation of Digital Filters on Pipelined Processor with Multiple Accumulators and Internal Datapaths

Hong, Chun-Pyo
- Journal of Korea Society of Industrial Information Systems / v.4 no.2 / pp.44-50 / 1999
This paper presents a set of techniques for automatically finding rate-optimal or near-rate-optimal implementations of shift-invariant flow graphs on a pipelined processor that has multiple accumulators and internal datapaths. In this case, the problem to be addressed is the scheduling of the multiple instruction streams that control all of the pipeline stages. The goal of an automatic scheduler in this context is to reorder the instructions so that they execute with the minimum iteration period between successive iterations of the defining flow graph. The scheduling algorithm described in this paper also addresses the removal of hazards caused by inter-instruction dependencies.

Implementation and Performance Test of DDFS Modulator using the Initial Clock Accumulating Method (클록초기치 누적방식을 사용한 DDFS 변조기 구현과 성능평가)

최승덕;김경태
- The Journal of the Acoustical Society of Korea / v.17 no.8 / pp.103-109 / 1998
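Digital signals are modulated by three basic methods: amplitude-shift keying (ASK), frequency-shift keying (FSK), and phase-shift keying (PSK). This paper reviews the theory of the sample-clock synthesis-coefficient method and, using a DDFS based on the initial clock accumulating method, implements a modulator that realizes the modulation methods above and is suited to frequency-hopping spread-spectrum communication. We also analyzed the spectrum of the synthesized sinusoidal output and examined the instantaneous frequency-hopping state using a PN (pseudo-noise) code, as well as the feasibility of phase control. The experiments gave the following results. First, the synthesized output frequency is exactly an integer multiple of the reference frequency, according to the frequency index. Second, comparing the fundamental with the harmonics in the spectrum of the synthesized sinusoid showed a difference of more than 50 [dB], confirming that the harmonic components are considerably reduced. Third, the instantaneous frequency-hopping state observed with the PN code showed a fast switching time and hence excellent frequency-hopping characteristics; in addition, we demonstrated that the phase is controlled by changing the set/reset state of the accumulator.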
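The role of the accumulator can be illustrated with a generic phase-accumulator DDFS. The Python sketch below is a textbook form, not the paper's initial clock accumulating method; the names `ddfs_samples` and `count_wraps`, the 8-bit accumulator width, and the sine table are assumptions chosen for illustration.

```python
import math


def ddfs_samples(freq_index: int, n_samples: int,
                 acc_bits: int = 8, initial_phase: int = 0):
    """Return (phases, samples) of a phase-accumulator synthesizer.

    Each clock adds freq_index to the accumulator modulo 2**acc_bits,
    and the accumulator value addresses a sine lookup table.
    """
    modulus = 1 << acc_bits
    table = [math.sin(2 * math.pi * i / modulus) for i in range(modulus)]
    acc = initial_phase % modulus     # presetting the accumulator sets the phase
    phases, samples = [], []
    for _ in range(n_samples):
        phases.append(acc)
        samples.append(table[acc])
        acc = (acc + freq_index) % modulus
    return phases, samples


def count_wraps(phases) -> int:
    """Number of accumulator overflows, i.e. completed output cycles."""
    return sum(1 for a, b in zip(phases, phases[1:]) if b < a)
```

In this model the output completes exactly `freq_index` cycles every `2**acc_bits` clocks, so the output frequency is an integer multiple of the base frequency `f_clk / 2**acc_bits`, and presetting the accumulator to half the modulus inverts the waveform.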

Design of Exp-Golomb CODEC for H.264/AVC Applications (H.264/AVC응용을 위한 Exp-Golomb CODEC의 설계)

Kim, Won-Sam;Sonh, Seung-Il
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.06a / pp.510-513 / 2007
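Variable-length coding is widely used in many image and video standards. In particular, the international JVT standard and AVS, the Chinese A/V standard, adopt UVLC (Universal Variable Length Code), which is based on Exp-Golomb codes, for entropy coding. This paper studies a hardware implementation of the Exp-Golomb CODEC used in H.264/AVC entropy coding. By simplifying the equations, the design avoids the log function and exponentiation, which are hard to implement, and uses a first-one detector and an accumulator-controlled barrel shifter so that encoding and decoding incur no additional delay. The design was synthesized with the Xilinx ISE tool and verified at board level through a PCI interface. The Exp-Golomb CODEC designed in this paper is expected to be applicable in fields such as H.264/AVC and AVS.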
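The idea of coding without log or power operations can be sketched in software for the unsigned order-0 Exp-Golomb code. In the Python sketch below, `int.bit_length()` plays the role of a first-one detector; the function names and the use of bit strings are assumptions for illustration and do not reflect the paper's hardware design.

```python
def ue_encode(code_num: int) -> str:
    """Unsigned order-0 Exp-Golomb codeword for code_num, as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    value = code_num + 1
    width = value.bit_length()        # position of the first 1 (no log needed)
    return "0" * (width - 1) + format(value, "b")


def ue_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; return (code_num, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":   # count the leading-zero prefix
        zeros += 1
    end = pos + 2 * zeros + 1         # prefix, marker 1, and zeros info bits
    value = int(bits[pos + zeros:end], 2)
    return value - 1, end
```

For example, `ue_encode(3)` gives `"00100"`: two leading zeros followed by the binary form of 4.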

A Matched Filter with Two Data Flow Paths for Searching Synchronization in DSSS (DSSS 동기탐색을 위한 이중 데이터 흐름 경로를 갖는 정합필터)

Song Myong-Lyol
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1A / pp.99-106 / 2004
In this paper, the matched filter for searching initial synchronization in a DSSS (direct sequence spread spectrum) receiver is studied. A matched filter with a single data flow path, which can be described in an HDL (Hardware Description Language), is presented first. To reduce the processing time of the filter operations, the equations are rearranged to represent two data flow paths, and the associated hardware model is proposed. The model has an architecture based on parallelism and pipelining for fast processing, in which two data flow paths, each a series of memory, multiplier, and accumulator, are placed in parallel. The performance of the model is analyzed and compared with that of the matched filter with a single data flow path.
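One way to picture the rearrangement into two data flow paths is to split the correlation sum into two partial sums that could be computed in parallel. The Python sketch below splits the taps into even and odd indices; this particular split, the function names, and the Barker-7 spreading code in the usage note are assumptions for illustration, since the abstract does not state how the equations are arranged.

```python
def matched_filter_single(samples, code):
    """Correlate samples with code using one multiply-accumulate path."""
    n = len(code)
    return [sum(samples[t + k] * code[k] for k in range(n))
            for t in range(len(samples) - n + 1)]


def matched_filter_two_paths(samples, code):
    """Same correlation, split into two independent accumulators."""
    n = len(code)
    out = []
    for t in range(len(samples) - n + 1):
        # Path A handles even-indexed taps, path B odd-indexed taps
        # (assumed split); in hardware the two paths would run in parallel.
        acc_a = sum(samples[t + k] * code[k] for k in range(0, n, 2))
        acc_b = sum(samples[t + k] * code[k] for k in range(1, n, 2))
        out.append(acc_a + acc_b)
    return out
```

With the Barker-7 code `[1, 1, 1, -1, -1, 1, -1]` embedded after three zero samples, both functions return a correlation peak of 7 at offset 3, which is the alignment a synchronization search looks for.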

Performance Analysis of Modulator using Direct Digital Frequency Synthesizer of Initial Clock Accumulating Method (클록 초기치 누적방식의 직접 디지털 주파수 합성기를 이용한 변조기의 성능해석)

최승덕;김경태
- Journal of the Korean Institute of Telematics and Electronics T / v.35T no.3 / pp.128-133 / 1998
This paper analyzes the performance of a modulator that uses a direct digital frequency synthesizer (DDFS) based on the Initial Clock Accumulating Method. PLL-based or digital frequency synthesis methods have generally been used to synthesize an arbitrarily chosen frequency. To overcome the disadvantages of these two methods, we constructed a modulator system using a DDFS based on the Initial Clock Accumulating Method. We also confirmed the coherent frequency hopping state and the possibility of phase control. The experiments gave the following results. First, the synthesized output frequency is proportional to the sampling frequency, according to the index K. Second, the difference in gain between the fundamental frequency and the harmonic frequencies was more than 50 [dB], which means that the harmonic components are reduced. Third, the coherent frequency hopping state was confirmed with a PN code sequence. We confirmed that the proposed method shortens the switching time, which indicates excellent frequency hopping characteristics. We also verified that the phase varies as the adder is set or reset.

An Efficient Integer Division Algorithm for High Speed FPGA (고속 FPGA 구현에 적합한 효율적인 정수 나눗셈 알고리즘)

Hong, Seung-Mo;Kim, Chong-Hoon
- Journal of the Institute of Electronics Engineers of Korea TC / v.44 no.2 / pp.62-68 / 2007
This paper proposes an efficient integer division algorithm for high-speed FPGAs that provide built-in RAMs and multipliers. The algorithm is iterative and uses a RAM-based LUT and multipliers, which minimizes the use of logic fabric and connection resources. Compared with popular division algorithms such as division by subtraction or division by multiply-subtraction, the number of iterations is much smaller, so very low latency can be achieved with pipelined implementations. We implemented the algorithm on a Xilinx Virtex-4 FPGA in VHDL and achieved a 300 MSPS data rate for 17-bit integer division. The algorithm used less than 1/6 of the logic slices, 1/4 of the built-in multiply-accumulate units, and 1/3 of the latency of other popular algorithms.
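The abstract does not spell out the iteration, so the Python sketch below substitutes a textbook scheme that fits the same description: a lookup table supplies a seed for the reciprocal of the divisor, a few multiply-based Newton-Raphson steps refine it, and a final correction fixes the quotient. The table size, the fixed-point format, the iteration count, and all names are assumptions, not the paper's algorithm.

```python
WIDTH = 17       # operand width in bits (from the abstract)
FRAC = 32        # fractional bits of the reciprocal estimate (assumed)
LUT_BITS = 6     # address width of the seed table (assumed)

# Seed table: 2**FRAC / x at the midpoint of each interval of x in [0.5, 1).
SEED = [(1 << (FRAC + LUT_BITS + 2)) // ((1 << (LUT_BITS + 1)) + 2 * i + 1)
        for i in range(1 << LUT_BITS)]


def divide(n: int, d: int, iterations: int = 2) -> int:
    """Floor of n / d for unsigned WIDTH-bit operands, d nonzero."""
    if not 0 <= n < (1 << WIDTH) or not 0 < d < (1 << WIDTH):
        raise ValueError("operands must fit in WIDTH bits and d must be nonzero")
    shift = WIDTH - d.bit_length()
    dn = d << shift                   # normalized divisor, x = dn / 2**WIDTH
    index = (dn >> (WIDTH - 1 - LUT_BITS)) & ((1 << LUT_BITS) - 1)
    y = SEED[index]                   # y approximates 2**FRAC / x
    for _ in range(iterations):
        t = (2 << FRAC) - ((dn * y) >> WIDTH)   # fixed-point 2 - x*y
        y = (y * t) >> FRAC                     # Newton-Raphson update
    q = (n * y) >> (FRAC + WIDTH - shift)       # quotient estimate
    while (q + 1) * d <= n:           # correct an estimate that is too low
        q += 1
    while q * d > n:                  # correct an estimate that is too high
        q -= 1
    return q
```

Each Newton-Raphson step roughly doubles the number of correct bits in the reciprocal, which is the usual reason such schemes need far fewer iterations than subtract-based division.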

Parallel Distributed Implementation of GHT on MPI-based PC Cluster (MPI 기반 PC 클러스터에서 GHT의 병렬 분산 구현)

Kim, Yeong-Soo;Kim, Jeong-Sahm;Choi, Heung-Moon
- Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.3 / pp.81-89 / 2007
This paper presents a parallel distributed implementation of the GHT (generalized Hough transform) for fast processing on an MPI-based PC cluster. We sought a higher speedup mainly by reducing the communication overhead through pipelined broadcast and an accumulator array partition strategy, and by overlapping communication with computation over the entire process. Experimental results show that nearly linear speedup is reachable with the proposed method on MPI-based PC clusters connected through a 100 Mbps Ethernet switch.

Face Detection Using Fusion of Heterogeneous Template Matching (이질적 템플릿 매칭의 융합을 이용한 얼굴 영역 검출)

Lee, Kyoung-Mi
- The Journal of the Korea Contents Association / v.7 no.12 / pp.311-321 / 2007
For fast and robust face detection, this paper proposes an approach based on the fusion of heterogeneous template matching. First, we detect skin regions using a skin color model that covers various illuminations and races. After reducing the search space by region labelling and filtering, we apply template matching with skin color and edges to the detected regions. Finally, we detect a face by finding the best choice of template fusion. Experimental results show that the proposed approach is more robust in skin-color-like environments than single template matching, and is fast because it reduces the search space to face candidate regions. In addition, using a global accumulator reduces the excessive space requirements of template matching.
https://doi.org/10.5392/JKCA.2007.7.12.311

Development and Evaluation of a Hybrid Damper for Semi-active Suspension (반능동 현가장치의 하이브리드형 댐퍼 개발에 관한 연구)

Jin, Chul Ho;Yoon, Young Won;Lee, Jae Hak
- Journal of Drive and Control / v.15 no.1 / pp.38-49 / 2018
This research describes the development, modeling, and testing of a hybrid damper applicable to a vehicle suspension. The hybrid damper is devised to improve the performance of a conventional passive oil damper by using a magneto-rheological (MR) accumulator, which consists of a gas accumulator and an MR device. The damping level is continuously variable by controlling the current applied to an MR device fitted to the floating piston that separates the gas and oil chambers. A simple MR device is used to resist the movement of the floating piston. First, a mathematical model that describes all flows within the conventional oil damper is formulated; then a small MR device is devised and incorporated into the mathematical model to characterize the performance of the device.
https://doi.org/10.7839/ksfc.2018.15.1.038

