Integrated Search | Korea Science

Design and Analysis of Efficient Parallel Hardware Prime Generators

Kim, Dong Kyue;Choi, Piljoo;Lee, Mun-Kyu;Park, Heejin
- JSTS: Journal of Semiconductor Technology and Science / Vol. 16, No. 5 / pp. 564-581 / 2016
We present an efficient hardware prime generator that generates a prime p by running trial division and the Fermat test in parallel. Since the execution time of this parallel combination is greatly influenced by the number k of the smallest odd primes used in the trial division, it is important to determine the optimal k that yields the fastest parallel combination. We present a probabilistic analysis to determine the optimal k and to estimate the expected running time of the parallel combination. Our analysis is conducted in two stages. First, we roughly narrow the range of the optimal k by using the expected values of the random variables used in the analysis. Second, we precisely determine the optimal k by using the exact probability distribution of the random variables. Our experiments show that the optimal k and the expected running time determined by our analysis are accurate. Furthermore, we generalize our analysis and propose a guideline that lets a designer of a hardware prime generator determine the optimal k simply by calculating the ratio of M to D, where M and D are the measured running times of a modular multiplication and an integer division, respectively.
https://doi.org/10.5573/JSTS.2016.16.5.564
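
The combination described in this abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the two tests run one after the other here rather than in parallel, the Fermat base 2 is an assumption, and the function names (`smallest_odd_primes`, `passes_trial_division`, `passes_fermat`, `generate_prime`) are hypothetical. The parameter `k` plays the role of the number of smallest odd primes used in trial division.

```python
import random

def smallest_odd_primes(k):
    """Return the k smallest odd primes (3, 5, 7, ...)."""
    primes = []
    candidate = 3
    while len(primes) < k:
        # Odd candidates only need to be checked against smaller odd primes.
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 2
    return primes

def passes_trial_division(n, small_primes):
    """True if no small prime divides n (n itself being that prime is allowed)."""
    return all(n % p != 0 or n == p for p in small_primes)

def passes_fermat(n, base=2):
    """Fermat test: a probable-prime check, so some composites also pass."""
    return pow(base, n - 1, n) == 1

def generate_prime(bits, k, rng=None):
    """Draw random odd candidates of the given bit length (bits >= 2)
    until one passes both tests."""
    rng = rng or random.Random()
    small_primes = smallest_odd_primes(k)
    while True:
        # Force the top bit (exact bit length) and the low bit (odd).
        n = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if passes_trial_division(n, small_primes) and passes_fermat(n):
            return n
```

In this sequential form, a larger `k` rejects more composites cheaply before the expensive modular exponentiation but costs more divisions per candidate, which is the trade-off the abstract's analysis of the optimal k addresses.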

An Efficient Hardware Implementation of Whirlpool Hash Function

박진철;신경욱
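- Korea Institute of Information and Communication Engineering: Conference Proceedings / Korea Institute of Information and Communication Engineering 2012 Fall Conference / pp. 263-266 / 2012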
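This paper describes an efficient hardware design of the Whirlpool hash function, standardized in ISO/IEC 10118-3, and its FPGA verification. Operating timing was optimized using pipelined small LUTs, and throughput was improved by running the Whirlpool block cipher and the key schedule in parallel. In the key schedule, the key addition is implemented with inverters and muxes instead of a ROM and XOR gates, which reduces area. The design was verified on a Virtex5-XC5VSX50T FPGA; the maximum operating frequency is about 151 MHz, with a throughput of about 950 Mbps.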

Energy Efficient Architecture Using Hardware Acceleration for Software Defined Radio Components

Liu, Chen;Granados, Omar;Duarte, Rolando;Andrian, Jean
- Journal of Information Processing Systems / Vol. 8, No. 1 / pp. 133-144 / 2012
To make cognitive radio systems a practical technology that can be deployed in real-world scenarios, the core Software Defined Radio (SDR) systems must meet the stringent requirements of the target application, especially in terms of performance and energy consumption on mobile platforms. In this paper we present a feasibility study of hardware acceleration as an energy-efficient implementation for SDR. We selected the amplifier function from the Software Communication Architecture (SCA) for hardware acceleration because it is one of the most frequently called functions and requires intensive floating-point computation. We then used a Virtex5 Field-Programmable Gate Array (FPGA) to compare compiler floating-point support with on-chip floating-point support. By enabling the on-chip floating-point unit (FPU), we obtained up to a 2X speedup and a 50% reduction in overall energy, with an increase in power consumption of no more than 0.68%. This demonstrates the feasibility of the proposed approach.
https://doi.org/10.3745/JIPS.2012.8.1.133

A Novel Spiral-Type Motion Estimation Architecture for H.264/AVC

Hirai, Naoyuki;Song, Tian;Liu, Yizhong;Shimamoto, Takashi
- JSTS: Journal of Semiconductor Technology and Science / Vol. 10, No. 1 / pp. 37-44 / 2010
New features of motion compensation, such as variable block size and multiple reference frames, are introduced in H.264/AVC. However, these new features significantly increase implementation complexity. In this paper, an efficient architecture for spiral-type motion estimation (ME) is proposed. First, we propose a hardware-friendly spiral search order. Then, an efficient processing element (PE) architecture for ME is proposed to realize this search order. The improved PE can move the reference pixel data by one pixel to the top, bottom, right, or left through four input/output ports. Moreover, the proposed architecture includes a parallel calculation structure that computes the SADs of all block sizes from the 4x4 SADs. In hardware implementation, the cost is about 145k gates, and the maximum clock frequency is 134 MHz on an FPGA (Xilinx Virtex-5).
https://doi.org/10.5573/JSTS.2010.10.1.037
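
The idea of deriving every block size from 4x4 SADs can be sketched in software as follows. This is only an illustration of the general reuse technique under the assumption of a 16x16 macroblock; it does not reproduce the paper's PE array or search order, and the function names `sad_4x4_grid` and `block_sad` are hypothetical.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock.

    cur and ref are 16x16 lists of pixel values (current and reference blocks).
    """
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid

def block_sad(grid, top, left, height, width):
    """SAD of a larger block as a sum of 4x4 SADs.

    top, left, height and width are given in units of 4x4 sub-blocks,
    so (0, 0, 4, 4) is the 16x16 block and (0, 2, 2, 2) is the top-right 8x8.
    """
    return sum(grid[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))
```

Because each pixel difference is accumulated once into the grid, the SADs of the 8x4, 4x8, 8x8, 16x8, 8x16, and 16x16 partitions need only additions of already computed 4x4 values.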

An Efficient Hardware Implementation of 257-bit Point Scalar Multiplication for Binary Edwards Curves Cryptography

김민주;정영수;신경욱
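- Korea Institute of Information and Communication Engineering: Conference Proceedings / Korea Institute of Information and Communication Engineering 2022 Spring Conference / pp. 246-248 / 2022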
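Binary Edwards curves (BEdC), a new form of elliptic curve proposed by Bernstein, have no exceptional points and therefore satisfy a complete addition law. This paper describes an efficient hardware implementation of point scalar multiplication on BEdC using projective coordinates. A modified Montgomery ladder algorithm is applied for point scalar multiplication, and the underlying binary field arithmetic is implemented with a 257-bit binary adder and binary squarer and a 32-bit binary multiplier. The designed BEdC crypto core was verified by implementing it on a Zynq UltraScale+ MPSoC device; a point scalar multiplication takes 521,535 clock cycles.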

Efficient Attribute Based Digital Signature that Minimizes Operations on Secure Hardware

윤정준;이정혁;김지혜;오현옥
- Journal of KIISE / Vol. 44, No. 4 / pp. 344-351 / 2017
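An attribute-based signature is a cryptographic scheme that uses a signing key carrying attributes to generate a signature based on an attribute predicate. If the signing key is leaked while a signature is being generated, signatures for that key can be forged. Signature generation must therefore be performed on hardware whose security is guaranteed; we call such hardware secure hardware. However, secure hardware is slow, so it is ill-suited to performing computation-heavy operations such as attribute-based signing quickly. This paper proposes an attribute-based signature scheme that splits the signing computation so that it can be used efficiently on a system composed of high-performance general hardware and secure hardware. The proposed scheme can be built from any existing attribute-based signature and an ordinary digital signature, and it guarantees security by running the ordinary digital signature on the secure hardware even when the attribute-based signature is computed in an insecure environment. The proposed scheme is 11 times faster than generating an existing attribute-based signature on secure hardware.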
https://doi.org/10.5626/JOK.2017.44.4.344

The Optimal Sizing and Efficient Driving Scheme of Series HEV

허민호
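- Korean Institute of Power Electronics: Conference Proceedings / Proceedings of the Korean Institute of Power Electronics 2000 Power Electronics Conference / pp. 651-656 / 2000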
This paper describes the optimal sizing of each component using computer simulation and presents an efficient operating scheme for a series HEV using a hardware simulator of the equivalent system. Because component sizing methods have been experimental and empirical, they require much time and development cost; computer simulation, however, can determine the optimal component sizing in a short time. A series HEV has two types of driving control: power-tracking mode and load-levelling mode. This paper shows that a series HEV should be operated in load-levelling mode, which is more efficient than power-tracking mode.

A High-speed/Low-power OFDM Frequency Offset Synchronization Compensation Block Design

한재웅;장영범
- Institute of Electronics Engineers of Korea: Conference Proceedings / Institute of Electronics Engineers of Korea 2008 Summer Conference / pp. 201-202 / 2008
In this paper, an efficient frequency offset compensation design for OFDM (Orthogonal Frequency Division Multiplexing) is proposed. The conventional CORDIC (COordinate Rotation DIgital Computer) approach to frequency offset compensation uses CORDIC hardware together with a complex multiplier. The proposed structure, by contrast, uses only a single CORDIC hardware block.

Design of finite field arithmetic for EC-KCDSA

최경문;황정태;류상준;김영철
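- Institute of Electronics Engineers of Korea: Conference Proceedings / Proceedings of the Institute of Electronics Engineers of Korea 2003 Summer Conference, Vol. II / pp. 935-938 / 2003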
The performance of elliptic-curve-based public key cryptosystems is mainly determined by the efficiency of the underlying finite field arithmetic. This work describes a finite field multiplier and divider implemented in SystemC. It also presents efficient hardware for performing elliptic curve point multiplication using the polynomial basis representation. To improve the speed of the multiplier with as little extra hardware as possible, a hybrid finite field multiplier and a finite field divider are adopted.

Meshfree/GFEM in hardware-efficiency prospective

Tian, Rong
- Interaction and multiscale mechanics / Vol. 6, No. 2 / pp. 197-210 / 2013
A fundamental trend in processor architecture as it evolves towards exaflops is rapidly increasing floating-point performance (so-called "free" flops) accompanied by much more slowly increasing memory and network bandwidth. To fully exploit the "free" flops, a numerical algorithm for PDEs should perform more flops per byte, that is, have higher arithmetic intensity. A meshfree/GFEM approximation can be one such class of algorithm. It is shown for a GFEM without extra dof that this kind of approximation takes advantage of the high performance of manycore GPUs through its high approximation accuracy; the "expensive" method is found, conversely, to be hardware-efficient on the emerging manycore architecture.
https://doi.org/10.12989/imm.2013.6.2.197
