• Title/Summary/Keyword: high speed multiplication

Search Result 107, Processing Time 0.03 seconds

Design of High-Speed Parallel Multiplier on Finite Fields GF(3m) (유한체 GF(3m)상의 고속 병렬 곱셈기의 설계)

  • Seong, Hyeon-Kyeong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.2
    • /
    • pp.1-10
    • /
    • 2015
  • In this paper, we propose a new multiplication algorithm for primitive polynomial with all 1 of coefficient in case that m is odd and even on finite fields $GF(3^m)$, and design the multiplier with parallel input-output module structure using the presented multiplication algorithm. The proposed multiplier is designed $(m+1)^2$ same basic cells. Since the basic cells have no a latch circuit, the multiplicative circuit is very simple and is short the delay time $T_A+T_X$ per cell unit. The proposed multiplier is easy to extend the circuit with large m having regularity and modularity by cell array, and is suitable to the implementation of VLSI circuit.

High Performance Integer Multiplier on FPGA with Radix-4 Number Theoretic Transform

  • Chang, Boon-Chiao;Lee, Wai-Kong;Goi, Bok-Min;Hwang, Seong Oun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.8
    • /
    • pp.2816-2830
    • /
    • 2022
  • Number Theoretic Transform (NTT) is a method to design efficient multiplier for large integer multiplication, which is widely used in cryptography and scientific computation. On top of that, it has also received wide attention from the research community to design efficient hardware architecture for large size RSA, fully homomorphic encryption, and lattice-based cryptography. Existing NTT hardware architecture reported in the literature are mainly designed based on radix-2 NTT, due to its small area consumption. However, NTT with larger radix (e.g., radix-4) may achieve faster speed performance in the expense of larger hardware resources. In this paper, we present the performance evaluation on NTT architecture in terms of hardware resource consumption and the latency, based on the proposed radix-2 and radix-4 technique. Our experimental results show that the 16-point radix-4 architecture is 2× faster than radix-2 architecture in expense of approximately 4× additional hardware. The proposed architecture can be extended to support the large integer multiplication in cryptography applications (e.g., RSA). The experimental results show that the proposed 3072-bit multiplier outperformed the best 3k-multiplier from Chen et al. [16] by 3.06%, but it also costs about 40% more LUTs and 77.8% more DSPs resources.

A Study on the Implementation of Digital Filters with Reduced Memory Space and Dual Impulse Response Types (기억용량 절약과 순회방식 선택이 가능한 디지털 필터의 구성에 관한 연구)

  • Park, In Jung;Rhee, Tae Won
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.23 no.6
    • /
    • pp.950-956
    • /
    • 1986
  • In this paper, a direct addressing mode of a microprocessor is introduced to save memory capacity, and also a dedicated digital filter is constructed to speed up the filter processing and to enable an easy selection of the impulse response types. A theoretical analysis has been conducted on the errors caused by the finite word klength, rounding-off and multiplication procedures. The digital filter designed by the proposed method is made into a module which can function as a 7th-order recursive or a 14-order nonrecursive type with a simples witch operation. The proposed filter is implemented on a printed-circuit board. The frequency characteristics of this filter can be controlled by the multiplication values stored in ROMs. A low-pass, a high-pass and a band-pass filter have been designed and their frequency characteristics are verified by actual measurements. For a order higher filer, two filter modules have been cascaded into an integrated filter of 23rd-order non-recursive low-pass type and a 12th-order recursive multiband type. Their frequency characteirstics have been found to agree with the theory.

  • PDF

A Study on High-Repetition Rate Optical-Pulse for OTDM System Using Fiber Loop Mirror (OTDM 시스템을 위한 광섬유 루프 미러를 이용한 고 반복률 펄스 발생에 관한 연구)

  • 최원석;정찬권;김선엽;강영진
    • Proceedings of the IEEK Conference
    • /
    • 2000.06b
    • /
    • pp.330-333
    • /
    • 2000
  • With the recent development of the ultrahigh-speed optical time division multiplexed system, high-repetition rate optical-pulse stream generation is necessary. This is different from conventional approaches, which use fiber or integrated waveguide delay line circuits. The high-repetition-rate optical-pulse multiplication phenomenon occurs when the optical pulse's spectral width is greater than the transfer bandwidth of the coupler used. From the analysis, the output repetition rate can be controlled by using fiber couplers with different equivalent transfer bandwidths. The pulse seperation spacing is controlled by number of cascaded coupler in optical loop mirror coupler scheme.

  • PDF

Approximate Multiplier with High Density, Low Power and High Speed using Efficient Partial Product Reduction (효율적인 부분 곱 감소를 이용한 고집적·저전력·고속 근사 곱셈기)

  • Seo, Ho-Sung;Kim, Dae-Ik
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.4
    • /
    • pp.671-678
    • /
    • 2022
  • Approximate computing is an computational technique that is acceptable degree of inaccurate results of accurate results. Approximate multiplication is one of the approximate computing methods for high-performance and low-power computing. In this paper, we propose a high-density, low-power, and high-speed approximate multiplier using approximate 4-2 compressor and improved full adder. The approximate multiplier with approximate 4-2 compressor consists of three regions of the exact, approximate and constant correction regions, and we compared them by adjusting the size of region by applying an efficient partial product reduction. The proposed approximate multiplier was designed with Verilog HDL and was analyzed for area, power and delay time using Synopsys Design Compiler (DC) on a 25nm CMOS process. As a result of the experiment, the proposed multiplier reduced area by 10.47%, power by 26.11%, and delay time by 13% compared to the conventional approximate multiplier.

An Efficient MAC Unit for High-Security RSA Cryptoprocessors (고비도 RSA 프로세서에 적용 가능한 효율적인 누적곱셈 연산기)

  • Moon, San-Gook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2007.06a
    • /
    • pp.778-781
    • /
    • 2007
  • RSA crypto-processors equipped with more than 1024 bits of key space handle the entire key stream in units of blocks. The RSA processor which will be the target design in this paper defines the length of the basic word as 128 bits, and uses an 256-bits register as the accumulator. For efficient execution of 128-bit multiplication, 32b*32b multiplier was designed and adopted and the results are stored in 8 separate 128-bit registers according to the status flag. In this paper, an efficient method to execute 128-bit MAC (multiplication and accumulation) operation is proposed. The suggested method pre-analyze the all possible cases so that the MAC unit can remove unnecessary calculations to speed up the execution. The proposed architecture protype of the MAC unit was automatically synthesized, and successfully operated at 20MHz, which will be the operation frequency in the target RSA processor.

  • PDF

Construction of High-Speed Parallel Multiplier on Finite Fields GF(3m) (유한체 GF(3m)상의 고속 병렬 승산기의 구성)

  • Choi, Yong-Seok;Park, Seung-Yong;Seong, Hyeon-Kyeong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.3
    • /
    • pp.510-520
    • /
    • 2011
  • In this paper, we propose a new multiplication algorithm for primitive polynomial with all 1 of coefficient in case that m is odd and even on finite fields $GF(3^m)$, and compose the multiplier with parallel input-output module structure using the presented multiplication algorithm. The proposed multiplier is designed $(m+1)^2$ same basic cells that have a mod(3) addition gate and a mod(3) multiplication gate. Since the basic cells have no a latch circuit, the multiplicative circuit is very simple and is short the delay time $T_A+T_X$ per cell unit. The proposed multiplier is easy to extend the circuit with large m having regularity and modularity by cell array, and is suitable to the implementation of VLSI circuit.

Review of Injection-Locked Oscillators

  • Choo, Min-Seong;Jeong, Deog-Kyoon
    • Journal of Semiconductor Engineering
    • /
    • v.1 no.1
    • /
    • pp.1-12
    • /
    • 2020
  • Handling precise timing in high-speed transceivers has always been a primary design target to achieve better performance. Many different approaches have been tried, and one of those is utilizing the beneficial nature of injection locking. Though the phenomenon was not intended for building integrated circuits at first, its coupling effect between neighboring oscillators has been utilized deliberately. Consequently, the dynamics of the injection-locked oscillator (ILO) have been explored, starting from R. Adler. As many aspects of the ILO were revealed, further studies followed to utilize the technique in practice, suggesting alternatives to the conventional frequency syntheses, which tend to be complicated and expensive. In this review, the historical analysis techniques from R. Adler are studied for better comprehension with proper notation of the variables, resulting in numerical results. In addition, how the timing jitter or phase noise in the ILO is attenuated from noise sources is presented in contrast to the clock generators based on the phase-locked loop (PLL). Although the ILO is very promising with higher cost effectiveness and better noise immunity than other schemes, unless correctly controlled or tuned, the promises above might not be realized. In order to present the favorable conditions, several strategies have been explored in diverse applications like frequency multiplication, data recovery, frequency division, clock distribution, etc. This paper reviews those research results for clock multiplication and data recovery in detail with their advantages and disadvantages they are referring to. Through this review, the readers will hopefully grasp the overall insight of the ILO, as well as its practical issues, in order to incorporate it on silicon successfully.

Multiplication optimization technique for Elliptic Curve based sensor network security (Elliptic curve기반 센서네트워크 보안을 위한 곱셈 최적화 기법)

  • Seo, Hwa-Jeong;Kim, Ho-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.8
    • /
    • pp.1836-1842
    • /
    • 2010
  • Sensor network, which is technology to realize the ubiquitous environment, recently, could apply to the field of Mechanic & electronic Security System, Energy management system, Environment monitoring system, Home automation and health care application. However, feature of wireless networking of sensor network is vulnerable to eavesdropping and falsification about message. Presently, PKC(public key cryptography) technique using ECC(elliptic curve cryptography) is used to build up the secure networking over sensor network. ECC is more suitable to sensor having restricted performance than RSA, because it offers equal strength using small size of key. But, for high computation cost, ECC needs to enhance the performance to implement over sensor. In this paper, we propose the optimizing technique for multiplication, core operation in ECC, to accelerate the speed of ECC.

The Implementation of Back Propagation Neural Network using the Residue Number System (잉여수계를 이용한 역전파 신경회로망 구현)

  • 홍봉화;이호선
    • The Journal of Information Technology
    • /
    • v.2 no.2
    • /
    • pp.145-161
    • /
    • 1999
  • This paper proposes a high speed back propagation neural networks which uses the residue number system. making the high speed operation possible without carry propagation Consisting of MAC(Multiplication and Accumulation) operator unit using Residue number system and sigmoid function operator unit using Mixed Residue Conversion is designed, The Designed circuits are descripted by VHDL and synthesized by Compass tools. Result of simulations shows that critical path delay time is about 19nsec and the size can be reduced to 40% compared to the neural networks implemented by the real number operation unit. The proposed design circuits can be implemented in parallel distributed processing system with desired real time processing.

  • PDF