• Title/Summary/Keyword: iterative operation (반복 연산)


Performance Evaluation and Implementation of Rank-Order Filter Using Neural Networks (신경회로망을 이용한 Rank-Order 필터의 구현과 성능 평가)

  • Yoon, Sook;Park, Dong-Sun
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.6B / pp.794-801 / 2001
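  • This paper presents and analyzes three neural network architectures for implementing rank-order filters and proposes uses for each. First, a 2-input sorter based on a neural network is proposed and used to build a hierarchical N-input sorter. Second, a rank-order filter is implemented with a feedforward neural network trained by backpropagation on training patterns built from the relative magnitudes of the input signals. Third, a rank-order filter is implemented with a feedforward neural network whose output layer receives the rank information as an external input. The proposed techniques are then compared and evaluated in terms of scalability, structural complexity, and time delay. The approach based on the 2-input sorter scales easily and has a relatively simple structure, but the network must be recurrent to sort the input signals, and the result is obtained only after a number of iterations proportional to the number of input signals. In contrast, the feedforward implementations of the rank-order filter reduce the time delay caused by these iterations but have a relatively complex structure.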
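
The trade-off described above (simple 2-input sorters, but a number of iterations proportional to the input count) can be illustrated with ordinary comparisons in place of neural networks. The sketch below is only an illustration under stated assumptions: `sort2` and `rank_order_filter` are hypothetical names, and the odd-even transposition arrangement is a textbook sorting network chosen for this example, not the hierarchical structure proposed in the paper.

```python
def sort2(a, b):
    """2-input sorter: return the pair as (smaller, larger)."""
    return (a, b) if a <= b else (b, a)


def rank_order_filter(window, rank):
    """Return the rank-th smallest value in window (rank 0 is the minimum).

    Uses odd-even transposition: n passes of neighbouring 2-input sorters,
    so the number of passes grows in proportion to the number of inputs.
    """
    values = list(window)
    n = len(values)
    for step in range(n):
        # Even passes sort pairs (0,1), (2,3), ...; odd passes (1,2), (3,4), ...
        for i in range(step % 2, n - 1, 2):
            values[i], values[i + 1] = sort2(values[i], values[i + 1])
    return values[rank]


# The median of a 5-sample window is the rank-2 output.
median = rank_order_filter([7, 2, 9, 4, 5], rank=2)
```

Within one pass the 2-input sorters work on disjoint pairs, so they could run in parallel; the delay comes from the n sequential passes.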


Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing (병렬 연산을 이용한 방출 단층 영상의 재구성 속도향상 기초연구)

  • Park, Min-Jae;Lee, Jae-Sung;Kim, Soo-Mee;Kang, Ji-Yeon;Lee, Dong-Soo;Park, Kwang-Suk
    • Nuclear Medicine and Molecular Imaging / v.43 no.5 / pp.443-450 / 2009
  • Purpose: Conventional image reconstruction uses simplified physical models of projection. However, reconstruction with realistic physics, for example in 3D, takes too long to process all the data in the clinic and cannot run on a common reconstruction machine because complex physical models require a large amount of memory. We propose a realistic distributed-memory model for fast reconstruction using parallel processing on personal computers to enable such large-scale techniques. Materials and Methods: Preliminary feasibility tests were performed on virtual machines, and various performance tests were performed on the commercial supercomputer Tachyon. The expectation maximization algorithm was tested with a common 2D projection and with realistic 3D lines of response. Because processing slowed down (by up to 6 times) after a certain iteration, compiler optimization was performed to maximize the efficiency of parallelization. Results: Parallel processing of a program on multiple computers was possible on Linux with MPICH and NFS. We verified that the differences between the parallel-processed image and the single-processed image at the same iterations were below the significant digits of a floating-point number, about 6 bit. Dual processors showed good parallel computing efficiency (1.96 times). The slowdown was resolved by vectorization using SSE. Conclusion: This study established a realistic parallel computing system for the clinic that provides enough memory to reconstruct images with realistic physical models that cannot be simplified.

An Addition-Chain Heuristics and Two Modular Multiplication Algorithms for Fast Modular Exponentiation (모듈라 멱승 연산의 빠른 수행을 위한 덧셈사슬 휴리스틱과 모듈라 곱셈 알고리즘들)

  • 홍성민;오상엽;윤현수
    • Journal of the Korea Institute of Information Security & Cryptology / v.7 no.2 / pp.73-92 / 1997
  • Modular exponentiation ($M^E \bmod N$) is one of the most important operations in public-key cryptography. However, it is time-consuming because it deals with very large operands such as 512-bit integers. Modular exponentiation consists of repeated modular multiplications, and the number of repetitions equals the length of the addition chain of the exponent (E). Therefore, the execution time of modular exponentiation can be reduced by finding a shorter addition chain (i.e., reducing the number of repetitions) or by reducing the execution time of each modular multiplication. In this paper, we propose an addition-chain heuristic and two fast modular multiplication algorithms. Of the two modular multiplication algorithms, one is for multiplication between different integers and the other is for modular squaring. The proposed addition-chain heuristic finds the shortest addition chains among existing algorithms. The two proposed modular multiplication algorithms require fewer than half as many single-precision multiplications as previous algorithms. Implemented on a PC, the proposed algorithms reduce execution times by 30-50% compared with the Montgomery algorithm, which is the best among previous algorithms.
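
To make the link between addition chains and multiplication counts concrete, the following sketch uses the standard binary (square-and-multiply) method and counts the modular multiplications it performs. It is not the heuristic proposed in the paper; `modexp_binary` is a hypothetical name, and Python's built-in integer arithmetic stands in for the multi-precision routines.

```python
def modexp_binary(m, e, n):
    """Compute m**e mod n with left-to-right square-and-multiply.

    Returns (result, mults), where mults counts modular multiplications
    (squarings included). This count is the length of the addition chain
    that the binary method implicitly follows for the exponent e.
    """
    if e < 1:
        raise ValueError("exponent must be positive")
    result = m % n
    mults = 0
    # Skip the leading 1 bit: each further bit costs one squaring,
    # and each 1 bit costs one extra multiplication by m.
    for bit in bin(e)[3:]:
        result = (result * result) % n
        mults += 1
        if bit == "1":
            result = (result * m) % n
            mults += 1
    return result, mults


value, mults = modexp_binary(7, 15, 1000)
```

For E = 15 the binary method spends 6 multiplications, whereas the addition chain 1, 2, 3, 6, 12, 15 needs only 5. Gaps of this kind are what a better addition-chain heuristic recovers.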

A Design of High-speed Phase Calculator for 3D Depth Image Extraction from TOF Sensor Data (TOF 센서용 3차원 Depth Image 추출을 위한 고속 위상 연산기 설계)

  • Koo, Jung-Youn;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.2 / pp.355-362 / 2013
  • A hardware implementation of a phase calculator for extracting 3D depth images from a TOF (Time-Of-Flight) sensor is described. The designed phase calculator, which adopts a pipelined architecture to improve throughput, performs the arctangent operation using the vectoring mode of the CORDIC algorithm. Fixed-point MATLAB modeling and simulations are carried out to determine the optimized bit-widths and number of iterations. The designed phase calculator is verified by FPGA-in-the-loop verification using MATLAB/Simulink and synthesized with a TSMC 0.18-$\mu$m CMOS cell library. It has 16,000 gates, and the estimated throughput is about 9.6 Gbps at 200 MHz and 1.8 V.
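
As a rough software illustration of the vectoring-mode CORDIC arctangent mentioned above, the sketch below rotates the vector (x, y) toward the x-axis and accumulates the applied angle. It is a floating-point approximation under stated assumptions, not the paper's fixed-point pipelined design: `cordic_atan2` and its default of 16 iterations are hypothetical, and a hardware version would use shifts and a small arctangent lookup table instead of `2.0 ** -i` and `math.atan`.

```python
import math


def cordic_atan2(y, x, iterations=16):
    """Approximate atan2(y, x) for x > 0 with CORDIC in vectoring mode."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    angle = 0.0
    for i in range(iterations):
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y > 0:
            # Rotate clockwise by atan(2^-i) and record the angle.
            x, y = x + y * shift, y - x * shift
            angle += math.atan(shift)
        else:
            # Rotate counter-clockwise by atan(2^-i).
            x, y = x - y * shift, y + x * shift
            angle -= math.atan(shift)
    return angle


phase = cordic_atan2(1.0, 1.0)  # close to pi / 4
```

Each iteration drives y toward zero, and the residual error is bounded by the last rotation angle, so the iteration count trades accuracy against latency and area. That is the kind of choice the fixed-point simulations in the abstract are used to settle.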

A study on the design of general division operator for the divisor with a small number in RNS (소(少) 제수용 잉여수계 제산 연산기 설계에 관한 연구)

  • Kim, Yong-Sung
    • The Journal of Information Technology / v.7 no.2 / pp.19-28 / 2004
  • Many kinds of operators based on the Residue Number System (RNS) are used in special-purpose processors because of their merits in digital signal processing, computer graphics, and similar fields. However, RNS has drawbacks for general division and magnitude comparison. In this paper, a general division operator for small divisors in RNS is proposed. If a division performed using the multiplicative inverse has a remainder, the resulting quotient is larger than the maximum quotient obtained when the largest possible dividend is divided by the same divisor. This condition is used as the termination condition of the recursive operation. In addition, the divisor is substituted for the comparison value of the quotients. As a result, the proposed division operator is small and reasonably fast, but it is limited in the divisors it supports.


A New Pipelined Divider with a Small Lookup Table (작은 룩업테이블을 가지는 새로운 파이프라인 나눗셈기)

  • Jeong, Woong;Park, Woo-Chan;Kwak, Sung-Ho;Yang, Hoon-Mo;Jeong, Cheol-Ho;Han, Tack-Don;Lee, Moon-Key
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.9 / pp.724-733 / 2003
  • Dividers have generally been designed to use iteration, but research on pipelined dividers is now underway. A difficulty with known pipelined division units is that they require a large lookup table. In this paper, a cost-effective pipelined divider is proposed that needs a smaller lookup table than other pipelined dividers. The latency of the proposed divider is 3 cycles, and its area is 30% smaller than that of P. Hung's divider.

An Efficient Bit-serial Systolic Multiplier over GF($2^m$) (GF($2^m$)상의 효율적인 비트-시리얼 시스톨릭 곱셈기)

  • Lee Won-Ho;Yoo Kee-Young
    • Journal of KIISE:Computer Systems and Theory / v.33 no.1_2 / pp.62-68 / 2006
  • The important arithmetic operations over finite fields include multiplication and exponentiation. An exponentiation operation can be implemented using a series of squaring and multiplication operations over GF($2^m$) using the binary method. Hence, it is important to develop a fast algorithm and efficient hardware for multiplication. This paper presents an efficient bit-serial systolic array for MSB-first multiplication in GF($2^m$) based on the polynomial representation. As compared to the related multipliers, the proposed systolic multiplier gains advantages in terms of input-pin and area-time complexity. Furthermore, it has regularity, modularity, and unidirectional data flow, and thus is well suited to VLSI implementation.
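
The MSB-first multiplication in polynomial representation that the abstract refers to can be sketched in software as a Horner-style loop: multiply the partial result by x, reduce, and conditionally add the multiplicand. This is only a sequential illustration, not the bit-serial systolic array itself. `gf2m_mul_msb_first` is a hypothetical name, and encoding field elements as integers whose bit i is the coefficient of x^i is an assumption of this sketch.

```python
def gf2m_mul_msb_first(a, b, m, poly):
    """Multiply a and b in GF(2^m), scanning the bits of b from MSB to LSB.

    a and b must be below 2**m. poly is the irreducible polynomial,
    including its x^m term (for example 0x11B for x^8 + x^4 + x^3 + x + 1).
    """
    result = 0
    for i in range(m - 1, -1, -1):
        # Multiply the partial result by x, then reduce if degree m appears.
        result <<= 1
        if (result >> m) & 1:
            result ^= poly
        # Add (XOR) a when the current bit of b is set.
        if (b >> i) & 1:
            result ^= a
    return result


# Worked example in GF(2^8) with the polynomial 0x11B.
product = gf2m_mul_msb_first(0x57, 0x83, 8, 0x11B)
```

Every step uses only a shift, a conditional XOR with the polynomial, and a conditional XOR with `a`. That regularity is what makes the recurrence a natural fit for a bit-serial array.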

Design of VLSI Architecture for Efficient Exponentiation on $GF(2^m)$ ($GF(2^m)$ 상에서의 효율적인 지수제곱 연산을 위한 VLSI Architecture 설계)

  • 한영모
    • Journal of the Institute of Electronics Engineers of Korea SC / v.41 no.6 / pp.27-35 / 2004
  • Finite or Galois fields have been used in numerous applications such as error-correcting codes, digital signal processing, and cryptography. These applications often require exponentiation on GF($2^m$), which is a very computationally intensive operation. Most existing methods either implement the exponentiation iteratively with repeated multiplications, which leads to a heavy computational load, or incur a high hardware cost because of their structural complexity. In this paper, we present an effective VLSI architecture for exponentiation on GF($2^m$). The circuit computes the exponentiation by multiplying product terms, each of which corresponds to one exponent bit. Until now, use of this type of algorithm has been confined to a primitive element, but we generalize it to any element of GF($2^m$).
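
One way to read "multiplying product terms, each of which corresponds to an exponent bit" is the identity $a^e = \prod_i \left(a^{2^i}\right)^{e_i}$, where $e_i$ is bit i of the exponent. The sketch below illustrates that identity only; this interpretation, the helper names `gf2m_mul` and `gf2m_pow`, and the integer encoding of field elements are all assumptions of the example. It computes the terms sequentially, whereas a circuit could generate them in parallel.

```python
def gf2m_mul(a, b, m, poly):
    """Multiply a and b in GF(2^m); poly includes the x^m term."""
    result = 0
    for i in range(m - 1, -1, -1):
        result <<= 1
        if (result >> m) & 1:
            result ^= poly
        if (b >> i) & 1:
            result ^= a
    return result


def gf2m_pow(a, e, m, poly):
    """Compute a**e in GF(2^m) as a product of per-bit terms a^(2^i)."""
    result = 1
    term = a  # holds a^(2^i) for the current bit position i
    for i in range(e.bit_length()):
        if (e >> i) & 1:
            # Bit i of the exponent is set, so its term enters the product.
            result = gf2m_mul(result, term, m, poly)
        term = gf2m_mul(term, term, m, poly)  # squaring gives a^(2^(i+1))
    return result


# In GF(2^8) with polynomial 0x11B, every nonzero element satisfies a^255 = 1.
check = gf2m_pow(0x03, 255, 8, 0x11B)
```

The function works for any field element `a`, not only a primitive one, because it relies solely on repeated squaring and multiplication.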

A Resource-Constrained Scheduling Algorithm for High Level Synthesis (상위레벨 회로합성을 위한 자원제한 스케줄링 알고리즘)

  • Hwang In-Jae
    • Journal of the Institute of Convergence Signal Processing / v.6 no.1 / pp.39-44 / 2005
  • Scheduling for digital system synthesis assigns each operation in a control/data flow graph (CDFG) to a specific control step without violating precedence relations. It is one of the most important tasks because it directly influences the performance of the synthesized hardware. In this paper, we propose a resource-constrained scheduling algorithm. Our algorithm first analyzes the given CDFG to determine the number of functional units of each type, then assigns each operation to a control step while satisfying the constraints. It also tries to improve the solution iteratively by adjusting the number of functional units using the results collected from the previous scheduling. Experiments were performed to test the performance of the proposed algorithm, and the results are presented.


Simplified MMSE Detection with SoIC for Iterative Receivers in Multiple Antenna Systems (다중 안테나 시스템에서 연 간섭 제거를 이용한 저 복잡도 MMSE 신호 검출 방법)

  • Kim, Jong-Kyung;Seo, Jong-Soo
    • Journal of Advanced Navigation Technology / v.13 no.3 / pp.385-392 / 2009
  • A simplified minimum mean square error (MMSE) detection technique combined with soft interference cancellation (SoIC) is proposed for iterative receivers in multiple antenna systems. To avoid the repeated matrix inversions required to obtain the MMSE filter coefficients during the iterations between the soft detector and the decoder, simplified matrix inversion techniques are applied to calculate the filter coefficient matrix. Simulation results show that the proposed MMSE detection with SoIC achieves comparable or slightly degraded detection performance at significantly reduced complexity compared with conventional MMSE detection with SoIC.
