Search | Korea Science

Design of a Booth's Multiplier Suitable for Embedded Systems (임베디드 시스템에 적용이 용이한 Booth 알고리즘 방식의 곱셈기 설계)

Moon, San-Gook
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2007.10a / pp.838-841 / 2007
In this study, we implemented a $17 \times 17$-bit binary digital multiplier using the radix-4 Booth's algorithm. A two-stage pipeline architecture was applied to achieve higher throughput, and 4:2 adders were used to obtain a regular layout structure in the Wallace tree partition. To evaluate the circuit, several MPW chips were fabricated using Hynix 0.6-um 3M N-well CMOS technology. We also proposed an efficient test methodology and ran fault simulations. The chip contains 9115 transistors, and the core area occupies about $1135 \times 1545$ mm$^2$. Functional tests using an ATS-2 tester showed that it can operate with a 24 MHz clock at 5.0 V at room temperature.
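As an illustration of the radix-4 Booth recoding named in this abstract, the following is a minimal software sketch, not the paper's hardware design. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the 17-bit operand is assumed to be sign-extended to 18 bits so that it splits into 9 digit groups.

```python
def booth_radix4_digits(y: int, bits: int = 17) -> list[int]:
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digit i has weight 4**i.
    """
    if bits % 2:
        bits += 1  # sign-extend to an even width (assumption for this sketch)
    y_u = y & ((1 << bits) - 1)  # two's-complement bit pattern of y
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (y_u >> i) & 1
        b1 = (y_u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int = 17) -> int:
    """Sum the Booth partial products d_i * x * 4**i."""
    return sum((d * x) << (2 * i)
               for i, d in enumerate(booth_radix4_digits(y, bits)))
```

With this recoding a 17-bit multiplier produces 9 partial products instead of 17, which is the reduction that a Wallace tree of 4:2 adders would then sum in hardware.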

Image Resolution Reduction Algorithm of Arbitrary Rate and Its Hardware Architecture (임의의 비율을 지원하는 영상 축소 알고리즘과 하드웨어 구조)

Park, Hyun-Sang
- Journal of the Korea Academia-Industrial cooperation Society / v.10 no.11 / pp.3094-3097 / 2009
A general-purpose divider is normally unavoidable when implementing an image down-scaler for an arbitrary scaling ratio. To obtain an output from the divider at every clock cycle, the divider must be implemented as a LUT; however, its hardware size grows rapidly as the precision increases. In this paper, a new image scaling algorithm is presented for an arbitrary scaling ratio that requires neither a general-purpose nor a LUT-based divider. The proposed algorithm uses only comparators and adders, so the hardware size can be reduced to 1/10 of that of conventional approaches.
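The abstract does not spell out the algorithm, so the following is only a hedged sketch of one well-known division-free approach: a Bresenham-style accumulator that selects which input pixels to keep using nothing but an add and a compare per pixel. It is not claimed to be the paper's method, and `downscale_row` is a hypothetical name.

```python
def downscale_row(row: list[int], out_w: int) -> list[int]:
    """Pick `out_w` pixels from `row` without any division.

    Per input pixel: one addition, one comparison, and a conditional
    subtraction, mirroring an adder/comparator datapath.
    """
    in_w = len(row)
    if not 0 < out_w <= in_w:
        raise ValueError("out_w must be in 1..len(row)")
    out = []
    acc = 0
    for pixel in row:
        acc += out_w          # adder
        if acc >= in_w:       # comparator
            acc -= in_w
            out.append(pixel)  # emit this input pixel as an output pixel
    return out
```

Because the accumulator gains `out_w` per input pixel and loses `in_w` per emitted pixel, exactly `out_w` pixels are produced for any ratio `out_w / in_w`.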
https://doi.org/10.5762/KAIS.2009.10.11.3094

A high-speed complex multiplier based on redundant binary arithmetic (Redundant binary 연산을 이용한 고속 복소수 승산기)

신경욱
- Journal of the Korean Institute of Telematics and Electronics C / v.34C no.2 / pp.29-37 / 1997
A new algorithm and parallel architecture for high-speed complex number multiplication is presented, and a prototype chip based on the proposed approach is designed. By employing redundant binary (RB) arithmetic, an N-bit complex number multiplication is simplified to two RB multiplications (i.e., an addition of N RB partial products), which are responsible for the real and imaginary parts, respectively. In addition, an efficient RB encoding scheme proposed in this paper allows RB partial products to be generated without additional hardware or delay overhead compared with binary partial product generation. The proposed approach leads to a highly parallel architecture with regularity and modularity. As a result, it yields a much simpler realization and higher performance than the classical method based on real multipliers and adders. As a test vehicle, a prototype 8-b complex number multiplier core has been fabricated using $0.8\mu\textrm{m}$ CMOS technology. It contains 11,500 transistors in an area of about $1.05 \times 1.34 \textrm{mm}^2$. The functional and speed test results show that it can safely operate with a 200 MHz clock at $V_{DD}=2.5 V$, and consumes about 90 mW.

Low-area Pipeline FFT Structure in OFDM System Using Common Sub-expression Sharing and CORDIC (Common sub-expression sharing과 CORDIC을 이용한 OFDM 시스템의 저면적 파이프라인 FFT 구조)

Choi, Dong-Kyu;Jang, Young-Beom
- Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.4 / pp.157-164 / 2009
An efficient pipeline MDC radix-4 FFT structure is proposed in this paper. Every stage in the pipeline FFT structure consists of a delay, a commutator, and a butterfly. The proposed butterflies in the front and rear stages use CORDIC and Common Sub-expression Sharing (CSS) techniques, respectively. It is shown that the proposed butterfly structure can reduce the number of adders by sharing common patterns of CSD-type coefficients. Verilog-HDL modeling and Synopsys logic synthesis results show that the proposed structure achieves a 48.2% cell area reduction in the complex multiplication part and a 22.1% cell area reduction in the overall 256-point FFT structure compared with conventional structures. Consequently, the proposed FFT structure can be used efficiently in various OFDM systems.

Efficient Operator Design Using Variable Groups (변수그룹을 이용한 효율적인 연산기 설계)

Kim, Yong-Eun;Chung, Jin-Gyun
- Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.1 / pp.37-42 / 2008
In this paper, we propose a partial product addition method using variable groups for the design of operators such as multipliers and digital filters. With this method, full adders can be replaced with simple logic circuits. To show the efficiency of the proposed method, we applied it to the design of squarers and precomputer blocks of FIR filters. For 7-bit and 8-bit squarers, the proposed method reduces area, power, and delay time by up to {22.1%, 20.1%, 14%} and {24.7%, 24.4%, 6.7%}, respectively, compared with the conventional method. The proposed FIR precomputer circuit achieves up to {63.6%, 34.4%, 9.8%} reduction in area, power consumption, and propagation delay compared with the previous method.

Area and Power Efficient VLSI Architecture for Two Dimensional 16-point Modified Gate Diffusion Input Discrete Cosine Transform

Thiruveni, M.;Shanthi, D.
- JSTS:Journal of Semiconductor Technology and Science / v.16 no.4 / pp.497-505 / 2016
The two-dimensional (2D) Discrete Cosine Transform (DCT) is widely used in image and video processing systems. The limits of human visual perception permit the design of an approximate rather than an exact DCT. In this paper, we propose a digital implementation of a 16-point approximate 2D DCT architecture based on the one-dimensional (1D) DCT and the Modified Gate Diffusion Input (MGDI) technique. The 8-point 1D approximate DCT architecture requires only 12 additions for realization in digital VLSI. Additions can be performed using the proposed 8-transistor (8T) MGDI full adder, which uses 2 fewer transistors than the existing 10-transistor (10T) MGDI full adder. The approximate MGDI 2D DCT using 8T MGDI full adders is simulated in Tanner SPICE for a $0.18{\mu}m$ CMOS process technology at 100 MHz. The simulation results show that, using the 8T MGDI full adder instead of the 10T MGDI full adder, area and power are reduced by 13.9% and 15.08% in the 8-point approximate 2D DCT, and by 10.63% and 15.48% in the 16-point approximate 2D DCT. The proposed architecture improves hardware complexity, regularity, and modularity with a small compromise in accuracy.
https://doi.org/10.5573/JSTS.2016.16.4.497

A Low-Error Truncated Booth Multiplier (작은 오차를 갖는 절사형 Booth 승산기)

정해현;박종화;신경욱
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.10a / pp.617-620 / 2001
This paper describes an efficient error-compensation technique for designing a low-error truncated Booth multiplier that receives two N-bit numbers and produces an N-bit product by eliminating the N least-significant bits. Applying the proposed method, a truncated Booth multiplier for area-efficient and low-power applications has been designed, and its performance (truncation error, area) has been analyzed. Since the truncated Booth multiplier omits about half of the partial product generators and adders, it reduces area by about 35% to 40% compared with non-truncated parallel multipliers. Error analysis shows that the proposed approach reduces the average truncation error by approximately 30% to 40% compared with conventional methods.

Design of Low-Area HEVC Core Transform Architecture (저면적 HEVC 코어 변환기 아키텍쳐 설계)

Han, Seung-Mok;Nam, Woo-Jin;Lee, Seongsoo
- Journal of IKEEE / v.17 no.2 / pp.119-128 / 2013
This paper proposes and implements a core transform architecture; the core transform is one of the major processes in the HEVC video compression standard. The proposed architecture is implemented with only adders and shifters instead of area-consuming multipliers. The shifters are implemented with wires and multiplexers, which significantly reduces chip area. The architecture can also process blocks from $4{\times}4$ to $16{\times}16$ with common hardware by reusing processing elements. Designed in 0.13um technology, the core transform architecture can process a $16{\times}16$ block with a 2-D transform in 130 cycles, and its gate count is 101,015 gates.
https://doi.org/10.7471/ikeee.2013.17.2.119

A low-power systolic structure for MP3 IMDCT Using addition and shift operation (덧셈과 쉬프트 연산을 사용한 MP3 IMDCT의 저전력 Systolic 구조)

Jang Young Beom;Lee Won Sang
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.10C / pp.1451-1459 / 2004
In this paper, a low-power 32-point IMDCT structure is proposed for MP3. By re-ordering the IMDCT matrices, we propose a systolic structure whose sub-blocks operate with 16, 8, 4, 2, and 1 cycles, respectively. To reduce power consumption, the multiplications in each sub-block are implemented by add and shift operations with CSD (canonic signed digit) form coefficients. To further reduce the number of adders, we use common sub-expression sharing techniques. With these techniques, the relative power consumption of the proposed structure is reduced by 58.4% compared with the conventional structure using only 2's complement form coefficients. The validity of the proposed structure is verified through Verilog-HDL coding.
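To make the add-and-shift idea concrete, here is a small sketch of canonic signed digit (CSD) recoding and a shift-add multiplication built on it. It illustrates the general technique only, not the paper's IMDCT datapath; `to_csd` and `csd_multiply` are hypothetical names, and coefficients are assumed to be integers (for example, fixed-point scaled values).

```python
def to_csd(n: int) -> list[int]:
    """Return CSD digits of n, least-significant first, each in {-1, 0, 1}.

    No two adjacent digits are nonzero, which minimizes the number of
    add/subtract operations needed for a constant multiplication.
    """
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x: int, coeff: int) -> int:
    """Multiply x by coeff using only shifts and adds/subtracts."""
    return sum(d * (x << i) for i, d in enumerate(to_csd(coeff)) if d)
```

For example, the coefficient 23 (binary 10111, four nonzero bits) becomes 32 - 8 - 1 in CSD, so multiplying by it takes two add/subtract operations instead of three.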

Optimization Between Design Blocks using Carry-Save-Adders in VLSI Design (VLSI 설계에서 캐리-세이브 가산기를 이용한 설계 블록들 간의 최적화)

Kim, Tae-Hwan;Eom, Jun-Hyeong
- Journal of KIISE:Computer Systems and Theory / v.26 no.5 / pp.620-626 / 1999
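The carry-save adder (CSA) is one of the components most widely used in industrial circuit design to speed up the evaluation of arithmetic expressions. According to [3], applying CSAs to typical arithmetic expressions arising in real circuit designs yields up to a 54% improvement in computation speed and a 42% improvement in circuit area compared with designs that do not use them. However, this result was derived on the assumption that the expression is contained in a single design block (sub-design). As applications with large design size and complexity become more common, hierarchical design in units of design blocks is becoming essential, so CSA optimization of arithmetic expressions that span design blocks is also a very important problem for realizing CSA-based circuit optimization. To address this, this paper proposes a CSA optimization method for arithmetic expressions across design blocks using the concept of an auxiliary port. In experiments, the proposed technique was very effective at applying CSAs across the entire circuit, and compared with CSA-optimized circuits obtained without the technique, both the computation speed of the arithmetic expressions and the circuit area were considerably improved.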
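For readers unfamiliar with carry-save addition, the following minimal sketch models a 3:2 CSA on non-negative integers and uses it to reduce a multi-operand sum to two values before a single final carry-propagate addition. It illustrates the basic component only, not the auxiliary-port optimization proposed in the paper; `carry_save_add` and `csa_sum` are hypothetical names.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compression: return (sum_word, carry_word) with a+b+c == sum+carry.

    Each bit position is handled independently, so no carry ripples.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def csa_sum(operands: list[int]) -> int:
    """Add many operands using CSAs, then one carry-propagate addition."""
    ops = list(operands)
    while len(ops) > 2:
        a, b, c = ops.pop(), ops.pop(), ops.pop()
        ops.extend(carry_save_add(a, b, c))
    return sum(ops)  # the single final carry-propagate add
```

In hardware the speed benefit comes from deferring carry propagation to that one final adder, which is why expressions split across design blocks lose the benefit unless the intermediate sum and carry words can cross the block boundary.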

Search Result 129, Processing Time 0.024 seconds