• Title/Summary/Keyword: 쉬프트연산 (shift operation)

Search Result 44, Processing Time 0.029 seconds

A study on the design of a 32-bit ALU (32비트 ALU 설계에 대한 연구)

  • 황복식;이영훈
    • Journal of the Korea Society of Computer and Information / v.7 no.4 / pp.89-93 / 2002
  • This paper describes an ALU core suitable for a 32-bit DSP. The ALU operates on 32-bit data and occupies the third stage, execution, of a 5-stage pipeline structure. It supports arithmetic operations, logical operations, shifting, and so on. To implement the ALU core, each functional block is described in HDL, and the functional verification of the core is performed through HDL simulation. The ALU is designed for use in the 32-bit DSP.

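The operation classes named in this abstract (arithmetic, logical, shifting) can be illustrated with a small behavioural model. This is only a sketch in Python rather than the HDL the paper uses; the function name `alu32` and the opcode mnemonics are hypothetical, since the abstract does not list the actual instruction set.

```python
MASK32 = 0xFFFFFFFF  # keep every result within 32 bits


def alu32(op, a, b):
    """Behavioural sketch of a 32-bit ALU; opcode names are assumptions."""
    a &= MASK32
    b &= MASK32
    sh = b & 31  # shift amount taken from the low 5 bits of b
    if op == "add":
        r = a + b
    elif op == "sub":
        r = a - b
    elif op == "and":
        r = a & b
    elif op == "or":
        r = a | b
    elif op == "xor":
        r = a ^ b
    elif op == "sll":  # logical shift left
        r = a << sh
    elif op == "srl":  # logical shift right (zero fill)
        r = a >> sh
    elif op == "sra":  # arithmetic shift right (sign fill)
        signed = a - (1 << 32) if a & 0x80000000 else a
        r = signed >> sh
    else:
        raise ValueError(f"unknown opcode: {op}")
    return r & MASK32
```

Masking the result models wrap-around on overflow; a real core would typically also produce status flags, which are omitted here.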

Low-Complexity Lens-shading Correction Algorithm based on Piece-wise Linear Model (낮은 복잡도를 가지는 구간선형 모델 기반 렌즈음영왜곡 보상 알고리즘)

  • Lee, Bora;Park, Hyun Sang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.11a / pp.49-52 / 2011
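  • This paper proposes a low-complexity LSC (Lens-Shading Correction) algorithm based on a piece-wise linear model. The proposed algorithm computes the distance between each pixel and the lens center as an integer, uses that integer as the address into a LUT (Look-Up Table) storing the LSC gain for each distance, and performs LSC by multiplying the input pixel value by the gain. Computing the distance requires an additional square-root circuit. Because the LUT stores the average gain for each distance from the origin, devoting high precision to the square-root computation has little effect on the quality of the LSC-corrected image, so an integer square-root operation is used. The square-root computation is piece-wise linearized so that it can be completed with only additions and shift operations. Applied to a general camera module in mass production, the proposed algorithm exceeds the camera-module manufacturer's LSC evaluation criteria, and its very low hardware complexity makes it well suited to mobile camera implementations.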

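A piece-wise linear integer square root of the kind described in this abstract might look like the following sketch. The segmentation is an assumption made for illustration (one segment per power of four, with the chord slope 1/3 approximated as 1/4 + 1/16 + 1/64); the abstract does not give the paper's actual segments. The names `approx_isqrt` and `lsc_gain_index` are hypothetical.

```python
def approx_isqrt(n):
    """Approximate floor(sqrt(n)) using only additions and shifts."""
    if n <= 0:
        return 0
    k = (n.bit_length() - 1) >> 1  # segment index: 4**k <= n < 4**(k + 1)
    d = n - (1 << (2 * k))         # offset from the start of the segment
    # The chord from (4**k, 2**k) to (4**(k+1), 2**(k+1)) has slope
    # 1 / (3 * 2**k); 1/3 is approximated here by 1/4 + 1/16 + 1/64.
    return (1 << k) + (d >> (k + 2)) + (d >> (k + 4)) + (d >> (k + 6))


def lsc_gain_index(x, y, cx, cy):
    """Integer distance from the lens center (cx, cy), usable as a LUT address."""
    dx, dy = x - cx, y - cy
    return approx_isqrt(dx * dx + dy * dy)
```

The result is exact at powers of four and slightly underestimates elsewhere, which fits the abstract's point that the gain LUT tolerates a coarse distance value. Finer segments would reduce the error at the cost of more comparisons.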

A VLSI Architecture of an 8$\times$8 OICT for HDTV Application (HDTV용 8$\times$8 최적화 정수형 여현 변환의 VLSI 구조)

  • 송인준;황상문;이종하;류기수;곽훈성
    • Journal of the Korean Institute of Telematics and Electronics T / v.36T no.1 / pp.1-7 / 1999
  • We present a VLSI architecture for a high-performance 2-D DCT processor, intended for compression systems in real-time image processing or HDTV, that uses a fast computational algorithm for the Optimized Integer Cosine Transform (OICT). The coefficients of the OICT are integers, so the OICT performs only integer operations for both the forward and inverse transforms. Compared with the DCT, which performs floating-point operations, the proposed architecture can therefore greatly improve speed and considerably reduce hardware cost by replacing multiplication operations with shift and addition operations.

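Replacing a multiplication by an integer coefficient with shifts and additions, as this abstract describes, can be sketched as follows. The function name `shift_add_multiply` is hypothetical and the coefficient in the usage note is an arbitrary illustration, not an actual OICT coefficient.

```python
def shift_add_multiply(x, coeff):
    """Multiply x by a non-negative integer coeff using only shifts and adds."""
    acc = 0
    shift = 0
    while coeff:
        if coeff & 1:          # this bit of the coefficient is set
            acc += x << shift  # add the correspondingly shifted copy of x
        coeff >>= 1
        shift += 1
    return acc
```

For example, `shift_add_multiply(x, 10)` computes `(x << 1) + (x << 3)`. In hardware the coefficient is fixed, so the loop unrolls into a small network of wired shifts and adders instead of a general multiplier.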

Improved NTRUSign protocol (개선된 NTRUSign 프로토콜)

  • 배성현;황성민;최영근;김순자
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference / 2002.11a / pp.409-414 / 2002
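  • NTRU, introduced at the rump session of Crypto in 1996, is a cryptosystem based on a truncated polynomial ring that uses only additions and multiplications of small integers together with shift operations. Among its applications, NTRU-based signature schemes went through several revisions, leading to the introduction of NTRUSign in 2001. NTRUSign remedied the shortcomings of the earlier NSS schemes, but it was recently shown that its signature generation from a digital document is not a permutation scheme and that attacks using copies of signatures are possible. This paper therefore proposes an improved signature protocol (Improved NTRUSign) that combines a symmetric-key cipher with a shared key generated on the basis of the security of the NTRU cryptosystem.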

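The arithmetic this abstract attributes to NTRU takes place in the truncated polynomial ring Z_q[X]/(X^N - 1), where multiplying by X is a cyclic shift of the coefficient vector. The sketch below shows only that ring multiplication, not the NTRUSign scheme or the proposed protocol; the name `ring_multiply` and the toy parameters in the usage note are illustrative assumptions.

```python
def ring_multiply(a, b, q):
    """Multiply polynomials a and b in Z_q[X]/(X^N - 1).

    a and b are coefficient lists of equal length N, lowest degree first.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same length N")
    c = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue  # small, sparse operands skip most of the work
        for j, bj in enumerate(b):
            k = (i + j) % n  # X^N wraps around to 1: a cyclic shift
            c[k] = (c[k] + ai * bj) % q
    return c
```

With N = 3 and q = 32, `ring_multiply([0, 1, 0], [1, 2, 3], 32)` rotates the second operand by one position. When one operand has coefficients limited to -1, 0, and 1, each inner product reduces to adding or subtracting a shifted copy of the other operand.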

Design of Bit Manipulation Accelerator for Communication DSP (통신용 DSP를 위한 비트 조작 연산 가속기의 설계)

  • Jeong Sug H.;Sunwoo Myung H.
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.8 s.338 / pp.11-16 / 2005
  • This paper proposes a bit manipulation accelerator (BMA) with application-specific instructions that efficiently supports scrambling, convolutional encoding, puncturing, and interleaving. Conventional DSPs cannot perform bit manipulation functions effectively because they have multiply-accumulate (MAC) oriented data paths and word-based functions. The proposed accelerator, however, can process bit manipulation functions efficiently using parallel shift and Exclusive-OR (XOR) operations and bit insertion/extraction operations on multiple data. The proposed BMA has been modeled in VHDL and synthesized using the SEC $0.18\mu m$ standard cell library, and its gate count is only about 1,700 gates. Performance comparisons show that the number of clock cycles can be reduced by about $40\%\sim80\%$ for scrambling, convolutional encoding, and interleaving compared with existing DSPs.
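
To show why scrambling reduces to shift and XOR operations, here is a bit-serial sketch of an additive scrambler. The generator polynomial x^7 + x^4 + 1 (the one used by the IEEE 802.11 scrambler) and the all-ones seed are assumptions chosen for illustration; the abstract does not state which polynomial the BMA targets, and the accelerator processes bits in parallel rather than one at a time as here.

```python
def scramble(bits, state=0b1111111):
    """Additive scrambler over a 7-bit LFSR with taps at x^7 and x^4.

    bits is a list of 0/1 values; state is the 7-bit seed (assumed all ones).
    """
    out = []
    for bit in bits:
        feedback = ((state >> 6) ^ (state >> 3)) & 1  # XOR of the two taps
        state = ((state << 1) | feedback) & 0x7F      # shift the register
        out.append(bit ^ feedback)                    # XOR keystream into data
    return out
```

Because the keystream does not depend on the data, applying `scramble` twice with the same seed restores the original bits, so the same routine serves as the descrambler.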

Implementation of a Modified Cubic Convolution Scaler for Low Computational Complexity (저연산을 위한 수정된 3차 회선 스케일러 구현)

  • Jun, Young-Hyun;Yun, Jong-Ho;Park, Jin-Sung;Choi, Myung-Ryul
    • Journal of Korea Multimedia Society / v.10 no.7 / pp.838-845 / 2007
  • In this paper, we propose a modified cubic convolution scaler for the enlargement or reduction of digital images. The proposed method has lower computational complexity than the cubic convolution method. To reduce the computational complexity, we use the linear function of the cubic convolution and the difference between adjacent pixel values to select the interpolation method. We employ adders and barrel shifters to calculate the weights of the proposed method. The proposed method is compared with the conventional one in terms of computational complexity and image quality. It has been designed and verified in HDL (Hardware Description Language) and synthesized for a Xilinx Virtex FPGA.


The Design and Implementation of a Graphical Education System on the Structure and the Operation of ALU (ALU 구조와 단계별 연산과정을 그래픽 형태로 학습하는 교육 시스템의 설계 및 구현)

  • Ahn, Syung-Og;Nam, Soo-Jeong
    • The Journal of Engineering Research / v.2 no.1 / pp.31-37 / 1997
  • This paper describes the design and implementation of an 8-bit ALU graphic simulator that helps students studying the structure and operation of a general ALU. The ALU in this paper consists of three parts: an arithmetic circuit, a logic circuit, and a shifter. The arithmetic circuit performs arithmetic operations such as addition, subtraction, increment by 1, decrement by 1, and 2's complement; the logic circuit performs logic operations such as OR, AND, XOR, and NOT; and the shifter performs shift operations and transfers the results of the arithmetic and logic circuits to the data bus. The instructions related to these basic ALU functions were selected from the Z80 instruction set, an ALU circuit was designed around those instructions, and the designed circuit was implemented on a graphic screen. Every state of the data operations in the ALU is shown at the bit and logic-gate level.


A Study on the design of Hilbert transformer using the MAG Algorithm (MAG 알고리즘을 이용한 힐버트 변환기의 설계에 관한 연구)

  • Lee, Young-seock
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.7 no.3 / pp.121-125 / 2014
  • A hardware implementation of the Hilbert transform is an indispensable element of DSP systems, but it suffers from high system-level hardware complexity, which results in a large gate count. In this paper, we implement a Hilbert transformer using the MAG algorithm, which reduces the hardware complexity.

Using MAG Algorithm for Reducing Hardware in Hilbert Transformer Design (최소 가산 그래프 알고리즘에 의한 힐버트 변환기 설계에 관한 연구)

  • Lee, YoungSeock
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.2 no.4 / pp.45-51 / 2009
  • A hardware implementation of the Hilbert transform is an indispensable element of DSP systems, but it suffers from high system-level hardware complexity, which results in a large gate count. In this paper, we implement a Hilbert transformer using the MAG algorithm, which reduces the hardware complexity.

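The idea behind a minimum adder graph (the 최소 가산 그래프 in the Korean title) is to build a constant multiplication from shifts and as few adders as possible by reusing intermediate products. The sketch below compares two ways of multiplying by 45; the constant and the function names are illustrative assumptions, not filter coefficients taken from either paper.

```python
def times45_binary(x):
    """45 = 0b101101: one shifted term per set bit, three adders."""
    return (x << 5) + (x << 3) + (x << 2) + x


def times45_graph(x):
    """Adder-graph form: reuse 5x, so only two adders are needed."""
    x5 = (x << 2) + x      # 5x = 4x + x
    return (x5 << 3) + x5  # 45x = 8 * (5x) + 5x
```

Both functions return the same product, but the second uses one adder fewer. Applied across every coefficient of a filter, savings of this kind are what lower the gate count.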

Design of Serial Decimal Multiplier using Simultaneous Multiple-digit Operations (동시연산 다중 digit을 이용한 직렬 십진 곱셈기의 설계)

  • Yu, ChangHun;Kim, JinHyuk;Choi, SangBang
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.4 / pp.115-124 / 2015
  • In this paper, we propose a method that improves the performance of a serial decimal multiplier and a method that operates on multiple digits simultaneously. The proposed serial decimal multiplier reduces delay by removing the encoding module that generates the 2X and 4X multiples and by generating partial products using shift operations. The multiplier also reduces the number of operations by using multiple-digit operations. To estimate its performance, we synthesized the proposed multiplier with Design Compiler using the SMIC 110nm CMOS library. Synthesis results show that the area of the proposed serial decimal multiplier increases by 4%, but the delay is reduced by 5% compared with an existing serial decimal multiplier. In addition, the trade-off between area and latency with respect to the number of concurrent operations in the proposed multiple-digit multiplier is confirmed.