• Title/Summary/Keyword: semi-parallel architecture

Architecture of an LDPC Decoder for DVB-S2 using reuse Technique of processing units and Memory Relocation (연산기와 메모리 재사용을 이용한 효율적인 DVB-S2 규격의 LDPC 복호기 구조)

  • Park Jae-Geun;Lee Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.9 s.351 / pp.31-37 / 2006
  • Low-density parity-check (LDPC) codes have recently emerged owing to their excellent performance. DVB-S2, the European standard for high-definition satellite digital video broadcasting, has adopted LDPC codes as its channel coding scheme. This paper proposes a DVB-S2 LDPC decoder architecture using a hybrid parity-check matrix that is efficient in hardware implementation for both decoders and encoders. The hybrid H-matrices are constructed so that both the semi-random technique and the partly parallel structure can be applied to the design of encoders and decoders. Using the hybrid H-matrix scheme, the LDPC decoder architecture for DVB-S2 can be very practical and efficient. In addition, we present a new Variable Node processor Unit (VNU) architecture that allows the VNU to be reused across code rates, together with an optimized block memory placement for memory reuse. We design a DVB-S2 LDPC decoder of code rate 1/2 using the proposed architecture, estimate its performance, and compare it with other decoders.

Construction of Semi-Algebra Low Density Parity Check Codes for Parallel Array Processing (병렬 어레이 프로세싱을 위한 반집합 대수 LDPC 부호의 구성)

  • Lee Kwang-jae;Lee Moon-ho;Lee Dong-min
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.1C / pp.1-8 / 2005
  • In this paper, we present a novel LDPC code construction, called semi-algebra low-density parity-check (LDPC) codes, which is a kind of deterministic LDPC code based on a dual-diagonal sub-matrix. The construction method yields a class of high-rate LDPC codes. Codes in this class have a large girth and good minimum distance. Furthermore, they can be implemented with a simple parallel array architecture using cyclic shift registers, and they perform well under iterative decoding.

Fast Multi-Rate LDPC Encoder Architecture for WiBro System (WiBro 시스템을 위한 고속 LDPC 인코더 설계)

  • Kim, Jeong-Ki;S.P., Balakannan;Lee, Moon-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC / v.45 no.7 / pp.1-8 / 2008
  • Low-density parity-check (LDPC) codes have recently attracted attention in communication systems owing to their good performance. The WiBro standard has also adopted LDPC codes as a channel coding scheme. A weak point of LDPC encoder implementations is that a conventional binary matrix-vector multiplier requires many clock cycles, which limits throughput. In this paper, we propose a semi-parallel architecture that uses cyclic shift registers and exclusive-OR gates, instead of conventional matrix-vector multipliers, on the standard parity-check matrices built from Circulant Permutation Matrices (CPM). Furthermore, a multi-rate encoder is designed using the proposed architecture. Our multi-rate encoder for IEEE 802.16e LDPC codes needs fewer clock cycles and achieves higher throughput.
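
The idea of replacing a matrix-vector multiplier with shifts and XORs can be illustrated with a minimal Python sketch. It is not the authors' design: the function names, the shift convention (row i of the CPM has its single 1 in column (i + shift) mod z), and the use of `None` for an all-zero block are assumptions made for illustration.

```python
def cpm_matrix(z, shift):
    """Explicit z x z circulant permutation matrix (assumed convention:
    row i has a single 1 in column (i + shift) % z)."""
    return [[1 if j == (i + shift) % z else 0 for j in range(z)]
            for i in range(z)]


def matvec_gf2(matrix, vec):
    """Reference matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


def cpm_shift(vec, shift):
    """Same product as matvec_gf2(cpm_matrix(z, shift), vec),
    computed as a cyclic rotation instead of a multiplication."""
    s = shift % len(vec)
    return vec[s:] + vec[:s]


def block_row_xor(blocks, shifts):
    """One block row of a CPM-based matrix times a blocked vector:
    rotate each block by its shift and XOR the results.
    A shift of None stands for an all-zero sub-matrix."""
    acc = [0] * len(blocks[0])
    for block, s in zip(blocks, shifts):
        if s is None:
            continue
        acc = [a ^ b for a, b in zip(acc, cpm_shift(block, s))]
    return acc
```

In hardware terms, `cpm_shift` corresponds to a cyclic shift register and the accumulation in `block_row_xor` to a bank of XOR gates; the explicit matrix is only kept here as a reference for checking.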

$AB^2$ Semi-systolic Architecture over $GF(2^m)$ ($GF(2^m)$상에서 $AB^2$ 연산을 위한 세미시스톨릭 구조)

  • 이형목;전준철;유기영;김현성
    • Journal of the Korea Institute of Information Security & Cryptology / v.12 no.2 / pp.45-52 / 2002
  • In this contribution, we propose a new MSB (most significant bit) algorithm based on the AOP (All One Polynomial) and two parallel semi-systolic architectures to compute $AB^2$ over the finite field $GF(2^m)$. The proposed architectures are based on the standard basis and use the property of an irreducible AOP, whose coefficients are all 1. The proposed parallel semi-systolic architecture (PSM) has a critical path of $D_{AND2}+D_{XOR2}$ per cell and a latency of m+1. The modified parallel semi-systolic architecture (MPSM) has a critical path of $D_{XOR2}$ per cell and the same latency as PSM. The two proposed architectures, PSM and MPSM, have low latency and small hardware complexity compared to previous architectures. They can be used as a basic architecture for exponentiation, division, and inversion. Since the proposed architectures have regularity, modularity, and concurrency, they are suitable for VLSI implementation. They can be used as a basic architecture for algorithms that require exponentiation, such as the Diffie-Hellman key exchange scheme, the Digital Signature Algorithm (DSA), and the ElGamal encryption scheme. These algorithms can in turn be applied to implementing cryptosystems based on elliptic curves.
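
The AOP property that such architectures rely on can be shown with a short software sketch. It is not the paper's MSB-first algorithm or its cell design: it only uses the standard fact that the AOP $p(x) = x^m + \dots + x + 1$ divides $x^{m+1} + 1$, so products can be formed with cyclic rotations of m+1 bits and reduced once at the end. Polynomials are stored as Python integers (bit i is the coefficient of $x^i$), and all function names are illustrative.

```python
def aop(m):
    """All One Polynomial x^m + ... + x + 1 as a bit mask."""
    return (1 << (m + 1)) - 1


def mul_mod_cyclic(a, b, m):
    """Multiply modulo x^(m+1) + 1: multiplying by x^i is a
    cyclic left rotation of an (m+1)-bit word by i positions."""
    n = m + 1
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def reduce_aop(c, m):
    """Final reduction modulo the AOP: if the x^m coefficient is set,
    XOR with the all-ones word, which clears it."""
    return c ^ aop(m) if (c >> m) & 1 else c


def ab2(a, b, m):
    """A * B^2 in GF(2^m) defined by the AOP (m chosen so the AOP
    is irreducible, e.g. m = 2, 4, 10, 12)."""
    b_squared = mul_mod_cyclic(b, b, m)
    return reduce_aop(mul_mod_cyclic(a, b_squared, m), m)


def mul_mod_poly(a, b, poly, m):
    """Reference shift-and-reduce multiplication modulo `poly`
    (degree m), used only to check ab2."""
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result
```

For m = 4 the AOP is $x^4 + x^3 + x^2 + x + 1$, and `ab2(a, b, 4)` can be compared against `mul_mod_poly(a, mul_mod_poly(b, b, 0b11111, 4), 0b11111, 4)` for all 4-bit inputs.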

Design of Montgomery Algorithm and Hardware Architecture over Finite Fields (유한 체상의 몽고메리 알고리즘 및 하드웨어 구조 설계)

  • Kim, Kee-Won;Jeon, Jun-Cheol
    • Journal of Korea Society of Industrial Information Systems / v.18 no.2 / pp.41-46 / 2013
  • Finite field multipliers are the basic building blocks in many applications such as error-control coding, cryptography, and digital signal processing. Recently, many semi-systolic architectures have been proposed for multiplication over finite fields. The Montgomery multiplication algorithm is also well known as an efficient arithmetic algorithm. In this paper, we derive an efficient multiplication algorithm and propose an efficient semi-systolic Montgomery multiplier based on the polynomial basis. We select an ideal Montgomery factor that is suitable for parallel computation, so our architecture is divided into two parts that can be computed simultaneously. In our analysis, the architecture reduces time complexity by 30% to 50% compared to typical architectures.
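
For orientation, the textbook bit-serial Montgomery multiplication over $GF(2^m)$ can be sketched as follows. This is not the paper's algorithm: it uses the common Montgomery factor $x^m$ rather than the factor the authors select for splitting the computation into two parallel parts, and the function names are illustrative. Polynomials are stored as integers (bit i is the coefficient of $x^i$), and the modulus `f` is assumed to have degree m and constant term 1.

```python
def montgomery_mul_gf2m(a, b, f, m):
    """Return a * b * x^(-m) mod f over GF(2).
    Each iteration adds b if the current bit of a is set, makes the
    sum divisible by x by conditionally adding f, then divides by x."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b
        if c & 1:
            c ^= f      # f has constant term 1, so this clears bit 0
        c >>= 1         # exact division by x
    return c


def gf2m_mul(a, b, f, m):
    """Plain shift-and-reduce multiplication modulo f,
    used here only as a reference for checking."""
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a
        a <<= 1
        if (a >> m) & 1:
            a ^= f
    return result
```

Multiplying the Montgomery result by $x^m \bmod f$ recovers the ordinary product, which gives a simple way to check the sketch against `gf2m_mul`.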

An Extended Evaluation Algorithm in Parallel Deductive Database (병렬 연역 데이타베이스에서 확장된 평가 알고리즘)

  • Jo, U-Hyeon;Kim, Hang-Jun
    • The Transactions of the Korea Information Processing Society / v.3 no.7 / pp.1680-1686 / 1996
  • A deterministic method for updating intensional predicates is needed in a parallel deductive database, that is, a deductive database distributed over a parallel computer architecture. A strategy for the parallel evaluation of intensional predicates, using the data updated by the deterministic update method, is also required. This paper is concerned with an approach to updating a parallel deductive database in which every insertion or deletion can be performed in a deterministic way, and with an extended parallel semi-naive evaluation algorithm for a parallel computer architecture. After presenting the approach to updating intensional predicates and the strategy for parallel evaluation, we discuss its implementation. A parallel deductive database consists of a set of facts, the extensional database, and a set of rules, the intensional database. We assume that these sets are distributed across the processors, and we study how to update intensional predicates and how to evaluate them using the update method. The parallel architecture for the deductive database consists of a set of processors and a message-passing network that interconnects them.
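
As background, plain sequential semi-naive evaluation can be sketched on the classic transitive-closure program `path(X,Y) :- edge(X,Y).` and `path(X,Z) :- path(X,Y), edge(Y,Z).` The sketch below is neither the extended nor the parallel algorithm of the paper; the rule set and function name are assumptions chosen for illustration.

```python
def seminaive_transitive_closure(edges):
    """Semi-naive evaluation of path/2 over the extensional
    relation `edges` (a set of (x, y) tuples).

    Only facts derived in the previous round (`delta`) are joined
    with `edges`, so earlier derivations are not repeated."""
    edges = set(edges)
    path = set(edges)    # intensional facts derived so far
    delta = set(edges)   # facts that were new in the last round
    while delta:
        derived = {(x, z)
                   for (x, y) in delta
                   for (y2, z) in edges
                   if y == y2}
        delta = derived - path   # keep only genuinely new facts
        path |= delta
    return path
```

A parallel variant would partition `edges` and `delta` across processors and exchange newly derived tuples by message passing; that part is not shown here.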

Area-Efficient Semi-Parallel Encoding Structure for Long Polar Codes (긴 극 부호를 위한 저 면적 부분 병렬 극 부호 부호기 설계)

  • Shin, Yerin;Choi, Soyeon;Yoo, Hoyoung
    • Journal of IKEEE / v.23 no.4 / pp.1288-1294 / 2019
  • The capacity-achieving property of polar codes has made them attractive as error-correcting codes. However, this property is asymptotic: sufficient error-correction performance is achieved only when the code length is long. Therefore, an efficient architecture is needed to realize a very-large-scale integration implementation for long input data. Although the most basic fully parallel encoder is intuitive and easy to implement, it is not suitable for long polar codes because of its high hardware complexity. To address this, a partially parallel encoder was proposed, which gives excellent results in terms of hardware area. Nevertheless, this method has not been completely generalized and has the disadvantage that different architectures result depending on the hardware designer. In this paper, we propose a hardware design scheme based on a systematic approach optimized for bit-dimension permutations. By applying this solution, it is possible to design a generalized partially parallel encoder for long polar codes with the same intuitive architecture as a fully parallel encoder.
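
The difference between fully and partially parallel encoding can be illustrated with a small software model. It is not the architecture proposed in the paper: `polar_encode` applies the standard transform $x = u F^{\otimes n}$ with $F = \begin{pmatrix}1&0\\1&1\end{pmatrix}$ (no bit-reversal permutation), and `polar_encode_semi_parallel` is a hypothetical scheduling model in which only `p` XOR units are available per cycle.

```python
def _check_length(n):
    if n < 1 or n & (n - 1):
        raise ValueError("code length must be a power of two")


def polar_encode(u):
    """Butterfly form of x = u * F^(kron n) over GF(2)."""
    n = len(u)
    _check_length(n)
    x = list(u)
    step = 1
    while step < n:
        for i in range(0, n, 2 * step):
            for j in range(i, i + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def polar_encode_semi_parallel(u, p):
    """Same transform, but each stage is processed p XORs at a time.
    Returns (codeword, cycles), where one cycle models one use of
    the p shared XOR units (an assumed cost model)."""
    n = len(u)
    _check_length(n)
    x = list(u)
    cycles = 0
    step = 1
    while step < n:
        # The index pairs of one stage are disjoint, so any order works.
        pairs = [(j, j + step)
                 for i in range(0, n, 2 * step)
                 for j in range(i, i + step)]
        for k in range(0, len(pairs), p):
            for a, b in pairs[k:k + p]:
                x[a] ^= x[b]
            cycles += 1
        step *= 2
    return x, cycles
```

With `p = n // 2` the model needs one cycle per stage, like a fully parallel encoder; smaller `p` trades cycles for fewer XOR units, which is the area saving a partially parallel design aims for.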

Architecture of General and Intelligent Parallel Processing System (범용성과 지능성을 갖는 병렬 처리기 구조)

  • Lee, Hyung;Choi, Sung-Hyuk;Kim, Jung-Bae;Park, Jong-Won
    • Proceedings of the Korea Information Processing Society Conference / 2000.10a / pp.601-604 / 2000
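  • In this paper, we propose a parallel processing system with a Semi-MIMD architecture in order to improve the efficiency of the SIMD parallel processor system that uses Park's multi-access memory, which was proposed for processing large volumes of image data in real time.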


A comparison study of water impact and water exit models

  • Korobkin, Alexander;Khabakhpasheva, Tatyana;Malenica, Sime;Kim, Yonghwan
    • International Journal of Naval Architecture and Ocean Engineering / v.6 no.4 / pp.1182-1196 / 2014
  • In problems of global hydroelastic ship response in severe seas, including the whipping problem, we need to know the hydrodynamic forces acting on the ship hull during almost arbitrary ship motions. Some ship sections may be entering the water while others are exiting it. Computations of nonlinear free-surface flows, pressure distributions, and hydrodynamic forces in parallel with computations of the ship motions, including elastic vibrations of the hull, are time-consuming and are suitable only for research purposes, not for practical calculations. In this paper, it is shown that the slamming forces can be decomposed into two components within three semi-analytical models of water entry. Only heave motion is considered. The first component is proportional to the entry speed squared and the second to the body acceleration. The coefficients of these two components are functions of the penetration depth only and can be precomputed for a given body shape. During the exit stage, the hydrodynamic force is proportional to the acceleration of the body and is independent of the body shape for bodies with small deadrise angles.
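
The decomposition described in the abstract can be written schematically as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: $h(t)$ is the penetration depth, $F_v$ and $F_a$ are the precomputable shape-dependent coefficients, and $F_{\mathrm{exit}}$ is the exit-stage coefficient, which the abstract says does not depend on the body shape for small deadrise angles. Sign conventions are left open.

$$
F_{\mathrm{entry}}(t) = F_v\big(h(t)\big)\,\dot{h}(t)^2 + F_a\big(h(t)\big)\,\ddot{h}(t),
\qquad
F_{\mathrm{exit}}(t) \propto \ddot{h}(t).
$$

Because $F_v$ and $F_a$ depend on $h$ only, they can be tabulated once per body shape and then evaluated along any heave motion.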

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science / v.14 no.4 / pp.391-406 / 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support for handling irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses present in a serial loop nest onto the underlying data-parallel architecture, based on a comprehensive static memory access pattern analysis. To that end, we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search a large design space for an appropriate thread mapping and work group size. To evaluate the effectiveness of our methodology, we report execution speedups for selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry-standard heterogeneous programming language OpenCL, targeting the NVIDIA GT200 architecture.