• Title/Summary/Keyword: 반복 연산 (iterative operation)

Search Result 501, Processing Time 0.029 seconds

Design of an Efficient Turbo Decoder by Initial Threshold Setting (초기 임계값 설정에 의한 효율적인 터보 복호기 설계)

  • 김동한;황선영
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.26 no.5B
    • /
    • pp.582-591
    • /
    • 2001
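  • Turbo codes were proposed as an error-correction scheme that uses an iterative decoding algorithm to achieve performance close to the Shannon limit on additive white Gaussian noise (AWGN) channels. However, the decoding delay caused by the amount of iterative computation and the delay introduced by the interleaver make real-time processing difficult. This paper proposes a turbo decoder architecture that reduces unnecessary decoding iterations by setting an appropriate initial threshold, without degrading the performance of the turbo code. The threshold is determined through repeated simulations, based on the mean and variance of the LLR (log-likelihood ratio) values and the BER of the decoder output. With a properly chosen initial threshold, the proposed scheme reduces the number of iterations without performance loss, which avoids the large decoding delay of a conventional fixed iteration count; the resulting reduction in computation also lowers power consumption. For performance evaluation, simulations were carried out in an IMT2000 high-speed data transmission environment with a BER of $10^{-6}$ or lower and a data rate of 32 kbps or higher. The results show that, compared with a conventional turbo decoder with a fixed number of iterations, the number of iterations is reduced by about 55-90% on average over an SNR range of 0-3 dB.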

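The threshold-based early termination described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's decoder: the stopping test (mean |LLR| compared against the threshold), the sign convention (positive LLR means bit 1), and the names `decode_with_threshold`, `iterate`, and `toy_iteration` are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def decode_with_threshold(
    llrs: List[float],
    iterate: Callable[[List[float]], List[float]],
    threshold: float,
    max_iterations: int = 8,
) -> Tuple[List[int], int]:
    """Iterate until the mean |LLR| reaches the threshold or the cap is hit.

    `iterate` stands in for one full decoding iteration (hypothetical).
    Returns the hard decisions and the number of iterations actually used.
    """
    used = 0
    for used in range(1, max_iterations + 1):
        llrs = iterate(llrs)
        mean_abs = sum(abs(v) for v in llrs) / len(llrs)
        if mean_abs >= threshold:
            break  # decisions are considered reliable enough: stop early
    bits = [1 if v > 0 else 0 for v in llrs]  # assumed sign convention
    return bits, used


def toy_iteration(llrs: List[float]) -> List[float]:
    # Stand-in for a real decoding iteration: it only grows the LLR magnitudes.
    return [1.5 * v for v in llrs]


bits, used = decode_with_threshold([0.8, -1.2, 0.5, -0.9], toy_iteration, threshold=4.0)
print(bits, used)  # [1, 0, 1, 0] 4
```

With a fixed count of 8 iterations the toy decoder would run twice as long as it does here; the abstract reports the measured savings for the real decoder.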

Study on Implementation of a High-Speed Montgomery Modular Exponentiator (고속의 몽고메리 모듈라 멱승기의 구현에 관한 연구)

  • Kim, In-Seop;Kim, Young-Chul
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2002.11b
    • /
    • pp.901-904
    • /
    • 2002
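  • The main operation of public-key cryptosystems, which are effective for encryption, authentication, and digital signatures, is modular exponentiation, which can be expressed as a sequence of repeated modular multiplications. In this paper, we implemented a modular exponentiator that performs modular multiplication efficiently using the Montgomery modular multiplication algorithm. To solve the carry-propagation problem that arises in Montgomery modular multiplication, we used a CSA (carry-save adder) in place of a CPA (carry-propagate adder) and showed that this minimizes the delay incurred during exponentiation. To implement the Montgomery modular exponentiator, we performed logic synthesis from a structural VHDL model using Synopsys VSS and Design Analyzer, and verified performance through FPGA simulation with Mentor Graphics ModelSim and Xilinx Design Manager.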

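The abstract above describes modular exponentiation as repeated Montgomery modular multiplication. A minimal software sketch of that arithmetic follows, assuming an odd modulus `n` and R = 2^k with k the bit length of `n`. The function names are illustrative, and the sketch models only the number theory, not the paper's CSA-based VHDL datapath.

```python
def montgomery_setup(n: int):
    """Return (r_bits, n_prime) for an odd modulus n, where R = 2**r_bits > n."""
    if n % 2 == 0:
        raise ValueError("Montgomery reduction needs an odd modulus")
    r_bits = n.bit_length()
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime == -1 (mod R); Python 3.8+
    return r_bits, n_prime


def mont_mul(a: int, b: int, n: int, r_bits: int, n_prime: int) -> int:
    """Montgomery product a * b * R^-1 mod n, for 0 <= a, b < n."""
    mask = (1 << r_bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> r_bits          # exact division by R is a shift
    return u - n if u >= n else u      # single conditional subtraction


def mont_pow(base: int, exp: int, n: int) -> int:
    """Left-to-right square-and-multiply using only Montgomery products."""
    r_bits, n_prime = montgomery_setup(n)
    r = 1 << r_bits
    x = (base * r) % n  # base in Montgomery form
    acc = r % n         # 1 in Montgomery form
    for bit in bin(exp)[2:]:
        acc = mont_mul(acc, acc, n, r_bits, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, r_bits, n_prime)
    return mont_mul(acc, 1, n, r_bits, n_prime)  # leave Montgomery form


print(mont_pow(7, 123, 1009) == pow(7, 123, 1009))  # True
```

Because the division by R is a shift, the inner loop needs no trial division by `n`, which is what makes the method attractive for hardware.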

A Study on the Extraction of Biosignal Parameters for the Computational Stress (연산 스트레스에 대한 감성 측정을 위한 생리 파라메터 추출에 대한 연구)

  • 하은호;김동윤;박광훈;임영훈;고한우;김동선;김승태
    • Proceedings of the Korean Society for Emotion and Sensibility Conference
    • /
    • 1999.11a
    • /
    • pp.139-144
    • /
    • 1999
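  • In this paper, we studied the extraction of physiological parameters for measuring computational stress, after having 45 male university students perform arithmetic tasks. To extract the parameters, we 1) applied transformations to normalize the distributions, 2) used correlation analysis to identify highly interrelated parameters, 3) standardized the parameters by comparing their values between the rest period and the arithmetic task, and 4) tested each parameter with repeated-measures analysis of variance, selecting the parameters that showed statistically significant differences. Through this procedure, Heart Rate, LF/HF of HRV, MF/(LF+HF) of HRV, the variance of the Return Map, Mean Temperature, GSR-Mean, and respiration rate were finally selected as the physiological parameters needed to index computational stress.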


A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems (방출단층촬영 시스템을 위한 GPU 기반 반복적 기댓값 최대화 재구성 알고리즘 연구)

  • Ha, Woo-Seok;Kim, Soo-Mee;Park, Min-Jae;Lee, Dong-Soo;Lee, Jae-Sung
    • Nuclear Medicine and Molecular Imaging
    • /
    • v.43 no.5
    • /
    • pp.459-467
    • /
    • 2009
  • Purpose: Maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Materials and Methods: Using a GeForce 9800 GTX+ graphics card and NVIDIA's CUDA (compute unified device architecture) technology, the projection and backprojection steps of the ML-EM algorithm were parallelized. The computation time for projection, for the error between measured and estimated data, and for backprojection within an iteration was measured. Total time included the latency of data transfer between RAM and GPU memory. Results: The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively; in this case, the computing speed was improved about 15 times on the GPU. When the number of iterations was increased to 1024, the CPU- and GPU-based computations took 18 min and 8 sec in total, respectively. The improvement was about 135 times and was caused by delays in the CPU-based computation after a certain number of iterations. In contrast, the GPU-based computation showed very small variation in time per iteration owing to the use of shared memory. Conclusion: GPU-based parallel computation significantly improved the computing speed and stability of ML-EM. The developed GPU-based ML-EM algorithm could easily be modified for other imaging geometries.
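
The three per-iteration stages named in the abstract above (projection, comparison of measured and estimated data, backprojection) can be sketched as a plain CPU reference. This is a minimal pure-Python illustration of the standard ML-EM update, not the paper's CUDA implementation; the dense `system` matrix, the function name `mlem`, and the toy data are assumptions made for the example.

```python
from typing import List


def mlem(system: List[List[float]], measured: List[float], iterations: int = 32) -> List[float]:
    """Standard ML-EM update for a small dense system matrix.

    system[i][j] is the assumed contribution of pixel j to detector bin i.
    Every pixel is assumed to be seen by at least one bin.
    """
    n_bins, n_pix = len(system), len(system[0])
    sensitivity = [sum(system[i][j] for i in range(n_bins)) for j in range(n_pix)]
    image = [1.0] * n_pix  # uniform, strictly positive starting image
    for _ in range(iterations):
        # 1) Forward projection of the current estimate.
        estimate = [sum(system[i][j] * image[j] for j in range(n_pix)) for i in range(n_bins)]
        # 2) Ratio of measured to estimated data.
        ratio = [measured[i] / estimate[i] if estimate[i] > 0 else 0.0 for i in range(n_bins)]
        # 3) Backprojection of the ratio, then the multiplicative update.
        back = [sum(system[i][j] * ratio[i] for i in range(n_bins)) for j in range(n_pix)]
        image = [image[j] * back[j] / sensitivity[j] for j in range(n_pix)]
    return image


toy_system = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_measured = [2.0, 3.0, 5.0]  # noiseless projections of the image [2, 3]
print(mlem(toy_system, toy_measured))  # close to [2.0, 3.0]
```

Steps 1 and 3 are independent across bins and pixels respectively, which is why they are the natural candidates for parallelization on a GPU.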

Fault Analysis Attacks on Control Statement of RSA Exponentiation Algorithm (RSA 멱승 알고리즘의 제어문에 대한 오류 주입 공격)

  • Gil, Kwang-Eun;Baek, Yi-Roo;Kim, Hwan-Koo;Ha, Jae-Cheol
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.19 no.6
    • /
    • pp.63-70
    • /
    • 2009
  • Many research results show that RSA systems implemented with the conventional binary exponentiation algorithm are vulnerable to physical attacks. Recently, Schmidt and Herbst demonstrated experimentally that an attacker can extract the secret key from faulty signatures obtained by skipping squaring operations. Under assumptions similar to those of Schmidt and Herbst's fault attack, we propose new fault analysis attacks that skip the multiplication operations or the computations in the loop control statement. Furthermore, we apply our attack to the Montgomery ladder exponentiation algorithm, which was proposed to defeat simple power analysis attacks. As a result, our fault attack can extract the secret key used in Montgomery ladder exponentiation.
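
The two exponentiation algorithms named in the abstract above can be sketched as follows, together with a toy fault model. This is an illustration under stated assumptions, not the paper's attack procedure: the `skip_multiply_at` parameter is a hypothetical way to model one skipped multiplication, and the small modulus is chosen only for readability.

```python
from typing import List, Optional


def binary_exp(base: int, exp_bits: List[int], n: int,
               skip_multiply_at: Optional[int] = None) -> int:
    """Left-to-right square-and-multiply; exp_bits is most significant bit first.

    skip_multiply_at models a fault that suppresses the multiplication
    in one loop iteration (hypothetical fault model).
    """
    acc = 1
    for i, bit in enumerate(exp_bits):
        acc = (acc * acc) % n
        if bit == 1 and i != skip_multiply_at:
            acc = (acc * base) % n
    return acc


def montgomery_ladder(base: int, exp_bits: List[int], n: int) -> int:
    """Montgomery ladder: one multiplication and one squaring per key bit."""
    r0, r1 = 1, base % n
    for bit in exp_bits:
        if bit == 0:
            r1 = (r0 * r1) % n
            r0 = (r0 * r0) % n
        else:
            r0 = (r0 * r1) % n
            r1 = (r1 * r1) % n
    return r0


key_bits = [1, 0, 1, 1]  # exponent 11, most significant bit first
correct = binary_exp(5, key_bits, 1009)
# A skipped multiplication changes the output only where the key bit is 1,
# so comparing faulty and correct outputs leaks that bit in this toy model.
print(binary_exp(5, key_bits, 1009, skip_multiply_at=1) == correct)  # True  (bit is 0)
print(binary_exp(5, key_bits, 1009, skip_multiply_at=2) == correct)  # False (bit is 1)
```

The ladder performs the same operation sequence for both bit values, which is the property that defeats simple power analysis but, according to the abstract, does not by itself stop fault attacks.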

A Study on High Speed LDPC Decoder Algorithm Based on DVB-S2 Standard (멀티미디어 기반 해상통신을 위한 DVB-S2 기반 고속 LDPC 복호를 위한 알고리즘에 관한 연구)

  • Jung, Ji Won;Kwon, Hae Chan;Kim, Yeong Ju;Park, Sang Hyuk;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38C no.3
    • /
    • pp.311-317
    • /
    • 2013
  • In this paper, we propose a high-speed LDPC decoding algorithm based on the DVB-S2 standard for multimedia transmission in marine communications. To implement the high-speed LDPC decoder, we apply the HSS algorithm, which reduces the number of iterations without performance degradation. In the HSS algorithm, the check node updates are performed at the same time as the bit node updates. HSS can accelerate decoding because it does not need a separate calculation of the bit nodes. However, the check node calculation blocks need many clock cycles because only one memory is used. Therefore, this paper proposes a partial memory structure that reduces the delay and makes a high-speed decoder possible. In simulations with the maximum number of iterations set to 30, the decoding throughput of the HSS algorithm is 326 Mbit/s and that of the proposed algorithm is 2.29 Gbit/s, so the proposed algorithm decodes more than 7 times faster than the HSS algorithm.

FPGA Design of SVM Classifier for Real Time Image Processing (실시간 영상처리를 위한 SVM 분류기의 FPGA 구현)

  • Na, Won-Seob;Han, Sung-Woo;Jeong, Yong-Jin
    • Journal of IKEEE
    • /
    • v.20 no.3
    • /
    • pp.209-219
    • /
    • 2016
  • SVM is a machine learning method used for image processing and is well known for its high classification performance. Using an SVM for image classification requires many MAC operations. As the resolution of the target image or the number of classification cases increases, the execution time of the SVM also increases, which makes real-time operation difficult. In this paper, we propose a hardware architecture that enables real-time applications using SVM classification. We use a parallel architecture to compute MAC operations simultaneously, and design the system to support several feature extractors for compatibility. An RBF kernel is used for the hardware implementation, and the exponent calculation formula in the kernel is modified to enable fixed-point modeling. Experimental results show that the system, implemented on a Xilinx ZC-706 evaluation board, processes 60.46 fps at $1360{\times}800$ resolution with a 100 MHz clock.
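
To show where the MAC operations and the exponent calculation mentioned in the abstract above arise, here is a minimal floating-point sketch of the textbook RBF-kernel SVM decision function. It is not the paper's fixed-point hardware formulation (the abstract does not give the modified exponent formula); the function name and parameter names are assumptions made for the example.

```python
import math
from typing import Sequence


def rbf_svm_decision(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    dual_coefs: Sequence[float],
    bias: float,
    gamma: float,
) -> float:
    """f(x) = bias + sum_i coef_i * exp(-gamma * ||x - sv_i||^2)."""
    total = bias
    for sv, coef in zip(support_vectors, dual_coefs):
        sq_dist = 0.0
        for a, b in zip(x, sv):
            diff = a - b
            sq_dist += diff * diff  # one multiply-accumulate per feature
        total += coef * math.exp(-gamma * sq_dist)  # exponent: costly in hardware
    return total


score = rbf_svm_decision(
    x=[0.0, 0.0],
    support_vectors=[[0.0, 0.0], [10.0, 10.0]],
    dual_coefs=[1.0, -1.0],
    bias=0.0,
    gamma=0.1,
)
print("class +1" if score > 0 else "class -1")  # class +1
```

The inner loop runs once per feature per support vector, and the iterations are independent of each other, which is why they lend themselves to parallel MAC units.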

Improvement in Inefficient Repetition of Gauss Sieve (Gauss Sieve 반복 동작에서의 비효율성 개선)

  • Byeongho Cheon;Changwon Lee;Chanho Jeon;Seokhie Hong;Suhri Kim
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.2
    • /
    • pp.223-233
    • /
    • 2023
  • Gauss Sieve is an algorithm for solving SVP and requires exponential time and space complexity. The termination condition of the sieve is determined by the size of the constructed list and the number of collisions, which are related to space complexity. A 'collision' is the event in which a sampled vector is reduced to a vector that is already in the list. If collisions occur more than a certain number of times, the algorithm terminates. When running previous algorithms, we noticed that unnecessary operations continued even after the shortest vector had been found, which means that the existing termination condition is set larger than necessary. In this paper, after identifying the point at which unnecessary operations are repeated, we optimize the number of operations required. The tests are conducted by adjusting the collision threshold that serves as the termination condition and the distribution from which the sample vectors are generated. According to the experiments, the operation that accounts for the largest proportion decreased by 62.6%. The space and time complexity also decreased by 4.3% and 1.6%, respectively.

A Proposal of Combined Iterative Algorithm for Optimal Design of Binary Phase Computer Generated Hologram (최적의 BPCGH 설계를 위한 합성 반복 알고리듬 제안)

  • Kim Cheol-Su
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.10 no.4
    • /
    • pp.16-25
    • /
    • 2005
  • In this paper, we propose a novel algorithm that combines simulated annealing and genetic algorithms for designing an optimal binary phase computer-generated hologram (BPCGH). In the genetic algorithm, which searches in block units, a simulated annealing search in pixel units is inserted after the crossover and mutation operations, which improves the performance of the BPCGH. Computer simulations show that the proposed combined iterative algorithm performs better than the simulated annealing algorithm in terms of diffraction efficiency.


QRD-RLS Algorithm Implementation Using Double Rotation CORDIC (2회전 CORDIC을 이용한 QRD-RLS 알고리듬 구현)

  • 최민호;송상섭
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.5C
    • /
    • pp.692-699
    • /
    • 2004
  • In this paper, we study an implementation of the QR decomposition-based RLS algorithm using a modified Givens rotation method. A Givens rotation can be obtained with a sequence of CORDIC operations. To reduce the computing time of the QR decomposition, we restrict the number of CORDIC iterations per Givens rotation and use a double-rotation method to remove the square root in the scaling factor.
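
The statement above that a Givens rotation can be built from a sequence of CORDIC operations can be illustrated with conventional vectoring-mode CORDIC. This is a minimal sketch of the standard single-rotation form, not the paper's double-rotation variant: it assumes `x > 0`, uses floating point in place of hardware shifts, and its gain contains the square root that the double-rotation method is said to remove. The function name is illustrative.

```python
import math
from typing import List, Tuple


def cordic_vectoring(x: float, y: float, iterations: int) -> Tuple[float, float, List[int]]:
    """Rotate (x, y) toward the x-axis using shift-and-add micro-rotations.

    Returns the gain-compensated (r, residual_y) and the rotation directions,
    which could be replayed on other element pairs of the same matrix rows.
    Assumes x > 0.
    """
    directions = []
    gain = 1.0
    for i in range(iterations):
        d = 1 if y >= 0 else -1  # rotate so that y moves toward zero
        step = 2.0 ** -i         # a right shift by i bits in hardware
        x, y = x + d * y * step, y - d * x * step
        directions.append(d)
        gain *= math.sqrt(1.0 + step * step)  # per-step scaling factor
    return x / gain, y / gain, directions


r, residual, _ = cordic_vectoring(3.0, 4.0, iterations=16)
print(round(r, 3), abs(residual) < 1e-3)  # 5.0 True
```

Fewer iterations shorten the computation but leave a larger residual in `y`, which is the accuracy and latency trade-off that restricting the iteration count exploits.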