• Title/Summary/Keyword: Clock performance

A Scalable Montgomery Modular Multiplier (확장 가능형 몽고메리 모듈러 곱셈기)

  • Choi, Jun-Baek;Shin, Kyung-Wook
    • Journal of IKEEE
    • /
    • v.25 no.4
    • /
    • pp.625-633
    • /
    • 2021
  • This paper describes a scalable architecture for flexible hardware implementation of Montgomery modular multiplication. Our scalable modular multiplier architecture, which is based on a one-dimensional array of processing elements (PEs), performs word-parallel operation and allows computational performance and hardware complexity to be adjusted through the number of PEs used, NPE. Based on the proposed architecture, we designed a scalable Montgomery modular multiplier (sMM) core supporting the eight field sizes defined in SEC2. Synthesized with a 180-nm CMOS cell library, our sMM core was implemented with 38,317 gate equivalents (GEs) for NPE=1 and 139,390 GEs for NPE=8. When operating with a 100 MHz clock, the core was estimated to compute 256-bit modular multiplications at 0.57 million times/sec for NPE=1 and 3.5 million times/sec for NPE=8. Our sMM core allows an optimized implementation by choosing the number of PEs according to the computational performance and hardware resources required in the application field, and it can be used as an IP (intellectual property) in scalable hardware design of elliptic curve cryptography (ECC).
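For readers unfamiliar with the operation this core accelerates, the following is a minimal software sketch of textbook Montgomery multiplication. It is not the paper's word-parallel PE-array architecture; the function names `montgomery_multiply` and `modmul_via_montgomery` are hypothetical, and the example modulus is the secp256k1 field prime, chosen here only as one 256-bit prime from SEC2.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * R^-1 mod n, where R = 2**k, n is odd, and a, b < n < R."""
    r_mask = (1 << k) - 1
    # n_prime satisfies n * n_prime = -1 (mod R); pow(n, -1, m) needs Python 3.8+.
    n_prime = (-pow(n, -1, 1 << k)) & r_mask
    t = a * b
    m = ((t & r_mask) * n_prime) & r_mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k                   # exact division by R, result is below 2n
    return u - n if u >= n else u          # single conditional subtraction


def modmul_via_montgomery(a: int, b: int, n: int, k: int) -> int:
    """Compute a * b mod n by converting to and from the Montgomery domain."""
    r = 1 << k
    a_mont = (a * r) % n
    b_mont = (b * r) % n
    prod_mont = montgomery_multiply(a_mont, b_mont, n, k)  # equals a*b*R mod n
    return montgomery_multiply(prod_mont, 1, n, k)          # strips the factor R


# 256-bit example (assumed modulus: the secp256k1 field prime).
p = 2**256 - 2**32 - 977
x, y = 0x1234567890ABCDEF, 0xFEDCBA0987654321
assert modmul_via_montgomery(x, y, p, 256) == (x * y) % p
```

The reduction step replaces division by the modulus with shifts and masks, which is what makes the operation attractive for hardware; a scalable design would process the operands word by word rather than as single big integers as done here.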

A Security SoC embedded with ECDSA Hardware Accelerator (ECDSA 하드웨어 가속기가 내장된 보안 SoC)

  • Jeong, Young-Su;Kim, Min-Ju;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.7
    • /
    • pp.1071-1077
    • /
    • 2022
  • A security SoC that can be used to implement elliptic curve cryptography (ECC) based public-key infrastructures was designed. The security SoC has an architecture in which a hardware accelerator for the elliptic curve digital signature algorithm (ECDSA) is interfaced with the Cortex-A53 CPU using the AXI4-Lite bus. The ECDSA hardware accelerator, which consists of a high-performance ECC processor, a SHA3 hash core, a true random number generator (TRNG), a modular multiplier, BRAM, and a control FSM, was designed to perform ECDSA signature generation and signature verification at high performance with minimal CPU control. The security SoC was implemented in the Zynq UltraScale+ MPSoC device to perform hardware-software co-verification, and the evaluation showed that ECDSA signature generation or signature verification can be performed about 1,000 times per second at a clock frequency of 150 MHz. The ECDSA hardware accelerator was implemented using hardware resources of 74,630 LUTs, 23,356 flip-flops, 32 kb of BRAM, and 36 DSP blocks.

Timing Driven Analytic Placement for FPGAs (타이밍 구동 FPGA 분석적 배치)

  • Kim, Kyosun
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.7
    • /
    • pp.21-28
    • /
    • 2017
  • Practical models for FPGA architectures, which include performance- and/or density-enhancing components such as carry chains, wide function multiplexers, and memory/multiplier blocks, are being applied to academic FPGA placement tools that used to rely on simple imaginary models. Techniques such as pre-packing and multi-layer density analysis have previously been proposed to remedy issues related to such practical models, and the wire length is effectively minimized during initial analytic placement. Since timing should be optimized rather than wire length, most previous work takes the timing constraints into account. However, the timing-driven techniques are mostly applied to subsequent steps such as placement legalization and iterative improvement rather than to the initial analytic placement. This paper incorporates timing-driven techniques, which check whether the placement meets the timing constraints given in the standard SDC format and minimize the detected violations, into an existing analytic placer that implements pre-packing and multi-layer density analysis. First, a static timing analyzer is used to check the timing of the wire-length-minimized placement results. To minimize the detected violations, a function that minimizes the largest arrival time at end points is added to the objective function of the analytic placer. Since each clock has a different period, the function is evaluated for each clock and added to the objective function. Since this function can unnecessarily shorten paths that have no violations, a new function that calculates and minimizes the largest negative slack at end points is also proposed and compared. Since the existing non-timing-driven legalization is used before the timing analysis, any improvement in timing is entirely due to the functions added to the objective function. Experiments on twelve industrial examples show that the minimum arrival time function improves the worst negative slack by 15% on average, whereas the minimum worst negative slack function improves the negative slacks by an additional 6% on average.
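The two timing quantities contrasted in this abstract can be illustrated with a small sketch. It assumes a simplified model in which the slack at an end point is the capturing clock's period minus the arrival time (setup time, clock skew, and SDC exceptions are ignored); the function names and the sample data are hypothetical, and the sketch does not attempt to reproduce how the paper embeds these terms in the placer's objective function.

```python
from collections import defaultdict


def max_arrival_per_clock(endpoints):
    """Largest arrival time at end points, grouped by capturing clock."""
    worst = defaultdict(float)
    for clock, arrival in endpoints:
        worst[clock] = max(worst[clock], arrival)
    return dict(worst)


def worst_negative_slack(endpoints, periods):
    """Most negative slack over all end points, or 0.0 if nothing violates."""
    slacks = [periods[clock] - arrival for clock, arrival in endpoints]
    return min([0.0] + slacks)


# Hypothetical data: (capturing clock, arrival time in ns) per end point.
periods = {"clk_a": 10.0, "clk_b": 4.0}
endpoints = [("clk_a", 9.0), ("clk_a", 11.5), ("clk_b", 3.0), ("clk_b", 4.5)]

print(max_arrival_per_clock(endpoints))          # {'clk_a': 11.5, 'clk_b': 4.5}
print(worst_negative_slack(endpoints, periods))  # -1.5
```

Minimizing the per-clock maximum arrival time keeps pushing on the latest path of each clock even when that path already meets timing, whereas the slack-based term becomes zero once every end point meets its period, which matches the motivation given for the second function.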

Drying of Rough Rice by Solar Collectors (태양(太陽) 열(熱) 집열기(集熱機)를 이용(利用)한 벼의 건조(乾燥)에 관(關)한 연구(硏究))

  • Chang, Kyu-Seob;Kim, Man-Soo;Kim, Dong-Man
    • Korean Journal of Food Science and Technology
    • /
    • v.11 no.4
    • /
    • pp.264-272
    • /
    • 1979
  • Flat-plate and tubular solar collectors were designed and constructed for drying rough rice, and the performance of the collectors and the drying effect were investigated with rough rice packed in grain bins connected to the collectors. The monthly average radiation on a horizontal surface, based on bright sunshine in the Daejeon area during 1978, was highest in May at $16,814\;kJ/m^2{\cdot}day$ and lowest in December at $4,254\;kJ/m^2{\cdot}day$, and no significant difference was recognized between the calculated and recorded values. The thermal efficiency of the collectors increased as radiation increased during the drying period, and the average thermal efficiencies of the flat-plate and tubular collectors from 11 to 12 o'clock a.m. were 28.12% and 16.75%, respectively. The average inlet temperature of the grain bin at 12 o'clock was $20.02^{\circ}C$ for the control, $40.5^{\circ}C$ for the grain bin connected to the tubular collector, and $55.1^{\circ}C$ for the grain bin connected to the flat-plate collector. At a rough rice depth of 25 cm in the grain bin, the time taken to dry from an initial moisture content of 27.4% down to 17.0% (14.5% on a wet basis) was 32 hrs for the control, 18 hrs for the grain bin connected to the tubular collector, and 11 hrs for the grain bin connected to the flat-plate collector, and grain depth influenced the drying rate remarkably. In terms of drying characteristics, the drying pattern showed a falling-rate period initially and a constant-rate period finally.

Differences in the Joint Movements and Muscle Activities of Novice according to Cycle Pedal Type

  • Seo, Jeong-Woo;Kim, Dae-Hyeok;Yang, Seung-Tae;Kang, Dong-Won;Choi, Jin-Seung;Kim, Jin-Hyun;Tack, Gye-Rae
    • Korean Journal of Applied Biomechanics
    • /
    • v.26 no.2
    • /
    • pp.237-242
    • /
    • 2016
  • Objective: The purpose of this study was to compare the joint movements and muscle activities of novices according to pedal type (flat, clip, and cleat pedal). Method: Nine novice male subjects (age: $24.4{\pm}1.9$ years, height: $1.77{\pm}0.05$ m, weight: $72.4{\pm}7.6$ kg, shoe size: $267.20{\pm}7.50$ mm) participated in 3-minute, 60-rpm cycle pedaling tests with the same load and cadence. Each subject's saddle height was determined by the $155^{\circ}$ knee flexion angle when the pedal crank was at the 6 o'clock position ($25^{\circ}$ knee angle method). The muscle activities of the vastus lateralis, tibialis anterior, biceps femoris, and gastrocnemius medialis were compared by using electromyography during 4 pedaling phases (phase 1: $330{\sim}30^{\circ}$, phase 2: $30{\sim}150^{\circ}$, phase 3: $150{\sim}210^{\circ}$, and phase 4: $210{\sim}330^{\circ}$). Results: The knee joint movement (range of motion) and maximum dorsiflexion angle of the ankle joint with the flat pedal were larger than those of the clip and cleat pedals. The maximum plantarflexion timing with the flat and clip pedals was faster than that of the flat pedal. Electromyography revealed that the vastus lateralis muscle activity with the flat pedal was greater than that with the clip and cleat pedals. Conclusion: With the clip and cleat pedals, the joint movements were limited but the muscle activities were more effective than those with the flat pedal. The novice cannot benefit from the clip and cleat pedals regardless of their pull-up pedaling advantage. Therefore, the novice should perform the skilled pulling-up pedaling exercise in order to benefit from the clip and cleat pedals in terms of pedaling performance.

R Based Parallelization of a Climate Suitability Model to Predict Suitable Area of Maize in Korea (국내 옥수수 재배적지 예측을 위한 R 기반의 기후적합도 모델 병렬화)

  • Hyun, Shinwoo;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.19 no.3
    • /
    • pp.164-173
    • /
    • 2017
  • Alternative cropping systems would be one of the options for climate change adaptation. Suitable areas for a crop could be identified using a climate suitability model. The EcoCrop model has been used to assess climate suitability of crops using monthly climate surfaces, e.g., the digital climate map at high spatial resolution. Still, a high-performance computing approach would be needed for assessment of climate suitability to take into account the complex terrain in Korea, which requires considerably large climate data sets. The objectives of this study were to implement a script for R, which is an open source statistical analysis platform, in order to use the EcoCrop model under a parallel computing environment and to assess climate suitability of maize using digital climate maps at high spatial resolution, e.g., 1 km. The total running time decreased as the number of CPU (Central Processing Unit) cores increased, although the speedup with an increasing number of CPU cores was not linear. For example, the wall clock time for assessing the climate suitability index at 1 km spatial resolution was reduced by 90% with 16 CPU cores. However, computing the climate suitability index took about 1.5 times as long as the theoretical time for the given number of CPU cores. Implementation of a climate suitability assessment system based on the MPI (Message Passing Interface) would allow support for the digital climate map at ultra-high spatial resolution, e.g., 30 m, which would help site-specific design of cropping systems for climate change adaptation.
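The relation between the reported 90% time reduction and the "about 1.5 times" figure can be checked with a short calculation. This sketch assumes that the reduction is measured against a single-core run and that the "theoretical time" is the single-core time divided by the number of cores (ideal linear speedup); neither assumption is stated explicitly in the abstract, and the function names are hypothetical.

```python
def speedup_from_reduction(time_reduction: float) -> float:
    """Speedup T1/Tp implied by a fractional wall-clock reduction (0.90 = 90%)."""
    return 1.0 / (1.0 - time_reduction)


def ratio_to_ideal_time(time_reduction: float, cores: int) -> float:
    """Measured parallel time divided by the ideal time T1/cores."""
    return cores * (1.0 - time_reduction)


cores = 16
speedup = speedup_from_reduction(0.90)    # about 10x instead of the ideal 16x
efficiency = speedup / cores              # about 0.63 parallel efficiency
ratio = ratio_to_ideal_time(0.90, cores)  # about 1.6 times the ideal time

print(f"speedup={speedup:.1f}, efficiency={efficiency:.2f}, ratio={ratio:.1f}")
```

Under these assumptions the measured run takes roughly 1.6 times the ideal time, which is in the same range as the figure quoted in the abstract given that the 90% reduction is itself a rounded value.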

LASPI: Hardware friendly LArge-scale stereo matching using Support Point Interpolation (LASPI: 지원점 보간법을 이용한 H/W 구현에 용이한 스테레오 매칭 방법)

  • Park, Sanghyun;Ghimire, Deepak;Kim, Jung-guk;Han, Youngki
    • Journal of KIISE
    • /
    • v.44 no.9
    • /
    • pp.932-945
    • /
    • 2017
  • In this paper, a new hardware and software architecture for a stereo vision processing system including rectification, disparity estimation, and visualization was developed. The developed method, named LArge scale stereo matching method using Support Point Interpolation (LASPI), excels at real-time processing for obtaining dense disparity maps from high quality image regions that contain high density support points. In the real-time processing of high definition (HD) images, LASPI does not degrade the quality level of disparity maps compared to existing stereo-matching methods such as Efficient LArge-scale Stereo matching (ELAS). LASPI has been designed to achieve a high frame rate, accurate distance resolution, and low resource usage even in a limited resource environment. These characteristics enable LASPI to be deployed in safety-critical applications such as obstacle recognition systems and distance detection systems for autonomous vehicles. The LASPI algorithm has been implemented on a Field Programmable Gate Array (FPGA) in order to support parallel processing and 4-stage pipelining. Various experiments verified that the developed FPGA system (Xilinx Virtex-7 FPGA, 148.5 MHz clock) is capable of processing 30 HD ($1280{\times}720$ pixels) frames per second in real time while generating disparity maps that are applicable to real vehicles.

DESIGN AND DEVELOPMENT OF MULTI-PURPOSE CCD CAMERA SYSTEM WITH THERMOELECTRIC COOLING I. HARDWARE (열전냉각방식의 범용 CCD 카메라 시스템 개발 I. 하드웨어)

  • Kang, Y.W.;Byun, Y.I.;Rhee, J.H.;Oh, S.H.;Kim, D.K.
    • Journal of Astronomy and Space Sciences
    • /
    • v.24 no.4
    • /
    • pp.349-366
    • /
    • 2007
  • We designed and developed a multi-purpose CCD camera system for three kinds of CCDs made by KODAK Co.: KAF-0401E ($768{\times}512$), KAF-1602E ($1536{\times}1024$), and KAF-3200E ($2184{\times}1472$). The system supports a fast USB port as well as a parallel port for data I/O and control signals. The packaging is based on two-stage circuit boards for size reduction and contains a built-in filter wheel. Basic hardware components include a clock pattern circuit, an A/D conversion circuit, a CCD data flow control circuit, and a CCD temperature control unit. The CCD temperature can be controlled with an accuracy of approximately $0.4^{\circ}C$ over the maximum temperature range of ${\Delta}33^{\circ}C$. This CCD camera system has a readout noise of $6\;e^-$ and a system gain of $5\;e^-/ADU$. A total of 10 CCD camera systems were produced, and our tests show that all of them show passable performance.

Conceptual Design Analysis of Satellite Communication System for KASS (KASS 위성통신시스템 개념설계 분석)

  • Sin, Cheon Sig;You, Moonhee;Hyoung, Chang-Hee;Lee, Sanguk
    • Journal of Advanced Navigation Technology
    • /
    • v.20 no.1
    • /
    • pp.8-14
    • /
    • 2016
  • This paper presents high-level conceptual design analysis results for the satellite communication system of the Korea augmentation satellite system (KASS), which is a part of KASS and consists of KASS uplink stations and two leased GEOs. We present its major functions, such as receiving correction and integrity messages from the central processing system, applying forward error correction to the messages, and modulating and up-converting the signal, along with conceptual design analysis of the design process concepts, GEO precise orbit determination for GEO ranging, which is an additional function, and clock steering for synchronization of clocks between GEO and GPS satellites. In addition, a service performance analysis of GEO ranging with respect to the navigation payload (transponder) RF bandwidth shows that KASS requires minimum bandwidths of 2.2 MHz for the SBAS augmentation service and 18.5 MHz for the GEO-ranging service. These analysis results will be fed into the KASS communication system design by carrying out a final analysis after the two GEOs and the sites of the KASS uplink stations are determined.

A Small-area Hardware Implementation of EGML-based Moving Object Detection Processor (EGML 기반 이동객체 검출 프로세서의 저면적 하드웨어 구현)

  • Sung, Mi-ji;Shin, Kyung-wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.12
    • /
    • pp.2213-2220
    • /
    • 2017
  • This paper proposes an efficient approach for hardware implementation of a moving object detection (MOD) processor using an effective Gaussian mixture learning (EGML)-based background subtraction method. Arithmetic units used in background generation were implemented using LUT-based approximation to reduce hardware complexity. Hardware resources used for both background subtraction and Gaussian probability density calculation were shared. The MOD processor was verified by FPGA-in-the-loop simulation using MATLAB/Simulink. The MOD performance was evaluated using six types of video defined in the IEEE CDW-2014 dataset, which resulted in an average recall of 0.7700, an average precision of 0.7170, and an average F-measure of 0.7293. The MOD processor was implemented with 882 slices and block RAM of $146{\times}36$ kbits on a Virtex5 FPGA, resulting in a 60% hardware reduction compared to the conventional EGML-based design. It was estimated that the MOD processor could operate with a 75 MHz clock, resulting in real-time processing of $800{\times}600$ video at a frame rate of 39 fps.
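The clock frequency and frame rate quoted above can be related through a simple throughput calculation. The sketch below assumes that every clock cycle is spent on active pixels (no blanking intervals or memory stalls); the function names are hypothetical, and the resulting cycles-per-pixel figure is an inference from the reported numbers, not a value stated in the abstract.

```python
def cycles_per_pixel(clock_hz: float, width: int, height: int, fps: float) -> float:
    """Clock cycles available per pixel at a given resolution and frame rate."""
    return clock_hz / (width * height * fps)


def max_frame_rate(clock_hz: float, width: int, height: int, cycles: float) -> float:
    """Frame rate achievable if each pixel costs a fixed number of cycles."""
    return clock_hz / (width * height * cycles)


# Figures reported in the abstract: 75 MHz clock, 800x600 video, 39 fps.
print(f"{cycles_per_pixel(75e6, 800, 600, 39):.2f} cycles per pixel")  # 4.01 cycles per pixel
print(f"{max_frame_rate(75e6, 800, 600, 4):.1f} fps at 4 cycles per pixel")  # 39.1 fps at 4 cycles per pixel
```

Under these assumptions the reported numbers are consistent with a budget of about four clock cycles per pixel.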