• Title/Summary/Keyword: Coprocessor

Search Result 56, Processing Time 0.021 seconds

Design of a High-Performance Information Security System-On-a-Chip using Software/Hardware Optimized Elliptic Curve Finite Field Computational Algorithms (소프트웨어/하드웨어 최적화된 타원곡선 유한체 연산 알고리즘의 개발과 이를 이용한 고성능 정보보호 SoC 설계)

  • Moon, San-Gook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.2
    • /
    • pp.293-298
    • /
    • 2009
  • In this contribution, a 193-bit elliptic curve cryptography coprocessor was implemented on an FPGA board. Optimized algorithms and numerical expressions which had been verified through C program simulation, should be analyzed again with HDL (hardware description language) such as Verilog, so that the verified ones could be modified to be applied directly to hardware implementation. The reason is that the characteristics of C programming language design is intrinsically different from the hardware design structure. The hardware IP which was double-checked in view of hardware structure together with algoritunic verification, was implemented on the Altera CycloneII FPGA device equipped with ARM9 microprocessor core, to a real chip prototype, using Altera embedded system development tool kit. The implemented finite field calculation IPs can be used as library modules as Elliptic Curve Cryptography finite field operations which has more than 193 bit key length.

Design of 32 bit Parallel Processor Core for High Energy Efficiency using Instruction-Levels Dynamic Voltage Scaling Technique

  • Yang, Yil-Suk;Roh, Tae-Moon;Yeo, Soon-Il;Kwon, Woo-H.;Kim, Jong-Dae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.9 no.1
    • /
    • pp.1-7
    • /
    • 2009
  • This paper describes design of high energy efficiency 32 bit parallel processor core using instruction-levels data gating and dynamic voltage scaling (DVS) techniques. We present instruction-levels data gating technique. We can control activation and switching activity of the function units in the proposed data technique. We present instruction-levels DVS technique without using DC-DC converter and voltage scheduler controlled by the operation system. We can control powers of the function units in the proposed DVS technique. The proposed instruction-levels DVS technique has the simple architecture than complicated DVS which is DC-DC converter and voltage scheduler controlled by the operation system and a hardware implementation is very easy. But, the energy efficiency of the proposed instruction-levels DVS technique having dual-power supply is similar to the complicated DVS which is DC-DC converter and voltage scheduler controlled by the operation system. We simulate the circuit simulation for running test program using Spectra. We selected reduced power supply to 0.667 times of the supplied power supply. The energy efficiency of the proposed 32 bit parallel processor core using instruction-levels data gating and DVS techniques can improve about 88.4% than that of the 32 bit parallel processor core without using those. The designed high energy efficiency 32 bit parallel processor core can utilize as the coprocessor processing massive data at high speed.

Security and Privacy Mechanism using TCG/TPM to various WSN (다양한 무선네트워크 하에서 TCG/TPM을 이용한 정보보호 및 프라이버시 매커니즘)

  • Lee, Ki-Man;Cho, Nae-Hyun;Kwon, Hwan-Woo;Seo, Chang-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.5
    • /
    • pp.195-202
    • /
    • 2008
  • In this paper, To improve the effectiveness of security enforcement, the first contribution in this work is that we present a clustered heterogeneous WSN(Wareless Sensor Network) architecture, composed of not only resource constrained sensor nodes, but also a number of more powerful high-end devices acting as cluster heads. Compared to sensor nodes, a high-end cluster head has higher computation capability, larger storage, longer power supply, and longer radio transmission range, and it thus does not suffer from the resource scarceness problem as much as a sensor node does. A distinct feature of our heterogeneous architecture is that cluster heads are equipped with TC(trusted computing) technology, and in particular a TCG(Trusted Computing Group) compliant TPM (Trusted Platform Module) is embedded into each cluster head. According the TCG specifications, TPM is a tamper-resistant, self-contained secure coprocessor, capable of performing cryptographic functions. A TPM attached to a host establishes a trusted computing platform that provides sealed storage, and measures and reports the integrity state of the platform.

  • PDF

Comparison of Parallel Computation Performances for 3D Wave Propagation Modeling using a Xeon Phi x200 Processor (제온 파이 x200 프로세서를 이용한 3차원 음향 파동 전파 모델링 병렬 연산 성능 비교)

  • Lee, Jongwoo;Ha, Wansoo
    • Geophysics and Geophysical Exploration
    • /
    • v.21 no.4
    • /
    • pp.213-219
    • /
    • 2018
  • In this study, we simulated 3D wave propagation modeling using a Xeon Phi x200 processor and compared the parallel computation performance with that using a Xeon CPU. Unlike the 1st generation Xeon Phi coprocessor codenamed Knights Corner, the 2nd generation x200 Xeon Phi processor requires no additional communication between the internal memory and the main memory since it can run an operating system directly. The Xeon Phi x200 processor can run large-scale computation independently, with the large main memory and the high-bandwidth memory. For comparison of parallel computation, we performed the modeling using the MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) libraries. Numerical examples using the SEG/EAGE salt model demonstrated that we can achieve 2.69 to 3.24 times faster modeling performance using the Xeon Phi with a large number of computational cores and high-bandwidth memory compared to that using the 12-core CPU.

Efficient Hardware Implementation of ${\eta}_T$ Pairing Based Cryptography (${\eta}_T$ Pairing 알고리즘의 효율적인 하드웨어 구현)

  • Lee, Dong-Geoon;Lee, Chul-Hee;Choi, Doo-Ho;Kim, Chul-Su;Choi, Eun-Young;Kim, Ho-Won
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.20 no.1
    • /
    • pp.3-16
    • /
    • 2010
  • Recently in the field of the wireless sensor network, many researchers are attracted to pairing cryptography since it has ability to distribute keys without additive communication. In this paper, we propose efficient hardware implementation of ${\eta}_T$ pairing which is one of various pairing scheme. we suggest efficient hardware architecture of ${\eta}_T$ pairing based on parallel processing and register/resource optimization, and then we present the result of our FPGA implementation over GF($2^{239}$). Our implementation gives 15% better result than others in Area Time Product.

Memory Organization for a Fuzzy Controller.

  • Jee, K.D.S.;Poluzzi, R.;Russo, B.
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 1993.06a
    • /
    • pp.1041-1043
    • /
    • 1993
  • Fuzzy logic based Control Theory has gained much interest in the industrial world, thanks to its ability to formalize and solve in a very natural way many problems that are very difficult to quantify at an analytical level. This paper shows a solution for treating membership function inside hardware circuits. The proposed hardware structure optimizes the memoried size by using particular form of the vectorial representation. The process of memorizing fuzzy sets, i.e. their membership function, has always been one of the more problematic issues for the hardware implementation, due to the quite large memory space that is needed. To simplify such an implementation, it is commonly [1,2,8,9,10,11] used to limit the membership functions either to those having triangular or trapezoidal shape, or pre-definite shape. These kinds of functions are able to cover a large spectrum of applications with a limited usage of memory, since they can be memorized by specifying very few parameters ( ight, base, critical points, etc.). This however results in a loss of computational power due to computation on the medium points. A solution to this problem is obtained by discretizing the universe of discourse U, i.e. by fixing a finite number of points and memorizing the value of the membership functions on such points [3,10,14,15]. Such a solution provides a satisfying computational speed, a very high precision of definitions and gives the users the opportunity to choose membership functions of any shape. However, a significant memory waste can as well be registered. It is indeed possible that for each of the given fuzzy sets many elements of the universe of discourse have a membership value equal to zero. It has also been noticed that almost in all cases common points among fuzzy sets, i.e. points with non null membership values are very few. More specifically, in many applications, for each element u of U, there exists at most three fuzzy sets for which the membership value is ot null [3,5,6,7,12,13]. Our proposal is based on such hypotheses. Moreover, we use a technique that even though it does not restrict the shapes of membership functions, it reduces strongly the computational time for the membership values and optimizes the function memorization. In figure 1 it is represented a term set whose characteristics are common for fuzzy controllers and to which we will refer in the following. The above term set has a universe of discourse with 128 elements (so to have a good resolution), 8 fuzzy sets that describe the term set, 32 levels of discretization for the membership values. Clearly, the number of bits necessary for the given specifications are 5 for 32 truth levels, 3 for 8 membership functions and 7 for 128 levels of resolution. The memory depth is given by the dimension of the universe of the discourse (128 in our case) and it will be represented by the memory rows. The length of a world of memory is defined by: Length = nem (dm(m)+dm(fm) Where: fm is the maximum number of non null values in every element of the universe of the discourse, dm(m) is the dimension of the values of the membership function m, dm(fm) is the dimension of the word to represent the index of the highest membership function. In our case then Length=24. The memory dimension is therefore 128*24 bits. If we had chosen to memorize all values of the membership functions we would have needed to memorize on each memory row the membership value of each element. Fuzzy sets word dimension is 8*5 bits. Therefore, the dimension of the memory would have been 128*40 bits. Coherently with our hypothesis, in fig. 1 each element of universe of the discourse has a non null membership value on at most three fuzzy sets. Focusing on the elements 32,64,96 of the universe of discourse, they will be memorized as follows: The computation of the rule weights is done by comparing those bits that represent the index of the membership function, with the word of the program memor . The output bus of the Program Memory (μCOD), is given as input a comparator (Combinatory Net). If the index is equal to the bus value then one of the non null weight derives from the rule and it is produced as output, otherwise the output is zero (fig. 2). It is clear, that the memory dimension of the antecedent is in this way reduced since only non null values are memorized. Moreover, the time performance of the system is equivalent to the performance of a system using vectorial memorization of all weights. The dimensioning of the word is influenced by some parameters of the input variable. The most important parameter is the maximum number membership functions (nfm) having a non null value in each element of the universe of discourse. From our study in the field of fuzzy system, we see that typically nfm 3 and there are at most 16 membership function. At any rate, such a value can be increased up to the physical dimensional limit of the antecedent memory. A less important role n the optimization process of the word dimension is played by the number of membership functions defined for each linguistic term. The table below shows the request word dimension as a function of such parameters and compares our proposed method with the method of vectorial memorization[10]. Summing up, the characteristics of our method are: Users are not restricted to membership functions with specific shapes. The number of the fuzzy sets and the resolution of the vertical axis have a very small influence in increasing memory space. Weight computations are done by combinatorial network and therefore the time performance of the system is equivalent to the one of the vectorial method. The number of non null membership values on any element of the universe of discourse is limited. Such a constraint is usually non very restrictive since many controllers obtain a good precision with only three non null weights. The method here briefly described has been adopted by our group in the design of an optimized version of the coprocessor described in [10].

  • PDF