• Title/Summary/Keyword: Decoding throughput

A Study on High Speed LDPC Decoder Algorithm based on dc separation (dc 분리 기반의 고속 LDPC 복호 알고리즘에 관한 연구)

  • Kwon, Hae-Chan;Kim, Tae-Hoon;Jung, Ji-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.9
    • /
    • pp.2041-2047
    • /
    • 2013
  • In this paper, we propose a high-speed LDPC decoding algorithm based on the DVB-S2 standard. To implement the high-speed LDPC decoder, we apply the HSS algorithm, which reduces the number of iterations without performance degradation. In the HSS algorithm, the check node updates are performed at the same time as the bit node updates. HSS can accelerate decoding because it does not need a separate calculation of the bit nodes. However, the check node calculation blocks need many clock cycles because only one memory is used. Therefore, this paper proposes a dc-split memory structure that reduces the delay and makes a high-speed decoder possible. Finally, this paper presents the maximum number of memory splits and the throughput for various coding rates in the DVB-S2 standard.

On Designing 4-way Superscalar Digital Signal Processor Core (4-way 수퍼 스칼라 디지털 시그널 프로세서 코어 설계)

  • 김준석;유선국;박성욱;정남훈;고우석;이근섭;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.23 no.6
    • /
    • pp.1409-1418
    • /
    • 1998
  • Recent audio CODEC (coding/decoding) algorithms combine several coding techniques and can be divided into DSP tasks, controller tasks, and mixed tasks. The traditional DSP processor has been designed for fast processing of DSP tasks only, not for controller and mixed tasks. This paper presents a new architecture that achieves high throughput on both controller and mixed tasks of such algorithms while maintaining high performance for DSP tasks. The proposed processor, YSP-3, operates four functional units (a multiplier, two ALUs, and a load/store unit) in parallel via a 4-issue superscalar instruction structure. The performance of YSP-3 has been evaluated by implementing several DSP algorithms and part of the AC-3 decoding algorithm.

Code Rate 1/2, 2304-b LDPC Decoder for IEEE 802.16e WiMAX (IEEE 802.16e WiMAX용 부호율 1/2, 2304-비트 LDPC 복호기)

  • Kim, Hae-Ju;Shin, Kyung-Wook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.4A
    • /
    • pp.414-422
    • /
    • 2011
  • This paper describes the design of a low-density parity-check (LDPC) decoder supporting a block length of 2,304 bits and code rate 1/2 of the IEEE 802.16e mobile WiMAX standard. The designed LDPC decoder employs the min-sum algorithm and a partially parallel layered-decoding architecture that processes a $96 \times 96$ sub-matrix in parallel. By exploiting the properties of the min-sum algorithm, a new memory reduction technique is proposed, which reduces check node memory by 46% compared to the conventional method. Functional verification results show an average bit-error rate (BER) of $4.34 \times 10^{-5}$ for an AWGN channel with Eb/No = 2.1 dB. Our LDPC decoder, synthesized with a $0.18\,\mu\mathrm{m}$ CMOS cell library, has 174,181 gates and 52,992 bits of memory, and the estimated throughput is about 417 Mbps at 100 MHz and 1.8 V.
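
For readers unfamiliar with the min-sum algorithm named in the abstract above, the following is a minimal sketch of a generic min-sum check node update. It is not taken from the paper: the function name `min_sum_check_node` and the example values are hypothetical. The sketch keeps only the two smallest magnitudes, the index of the smallest, and the signs, which is a commonly used compact representation; the abstract does not say whether the paper's memory reduction technique works this way.

```python
def min_sum_check_node(llrs):
    """Generic min-sum check node update (illustrative sketch).

    llrs: incoming variable-to-check messages (log-likelihood ratios).
    Returns one outgoing check-to-variable message per edge.
    """
    if len(llrs) < 2:
        raise ValueError("a check node needs at least two edges")

    sign_product = 1          # product of the signs of all inputs
    min1 = float("inf")       # smallest input magnitude
    min2 = float("inf")       # second-smallest input magnitude
    min_idx = -1              # edge that holds the smallest magnitude

    for i, value in enumerate(llrs):
        if value < 0:
            sign_product = -sign_product
        mag = abs(value)
        if mag < min1:
            min2 = min1
            min1, min_idx = mag, i
        elif mag < min2:
            min2 = mag

    out = []
    for i, value in enumerate(llrs):
        # Exclude the edge's own input: its own sign is divided out, and
        # the edge holding min1 receives min2 instead.
        own_sign = -1 if value < 0 else 1
        mag = min2 if i == min_idx else min1
        out.append(sign_product * own_sign * mag)
    return out


if __name__ == "__main__":
    print(min_sum_check_node([2.0, -0.5, 1.5, -3.0]))
```

Because every outgoing magnitude is either `min1` or `min2`, a decoder can in principle store those two values, one index, and the sign bits per check node instead of one message per edge.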

A Study of Efficient Viterbi Equalizer in FTN Channel (FTN 채널에서의 효율적인 비터비 등화기 연구)

  • Kim, Tae-Hun;Lee, In-Ki;Jung, Ji-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.6
    • /
    • pp.1323-1329
    • /
    • 2014
  • In this paper, we analyze an efficient decoding scheme for the FTN (faster-than-Nyquist) method, a transmission method that signals faster than the Nyquist rate and thereby increases throughput. We propose a Viterbi equalizer model to minimize ISI (inter-symbol interference) when an FTN signal is transmitted. The proposed model uses the interference as branch information. To decode the FTN signal, we use a turbo equalization algorithm that iteratively exchanges probabilistic information between a soft Viterbi equalizer (BCJR method) and an LDPC decoder. By changing the trellis diagram to maximize the Euclidean distance, we confirm that performance improves over conventional methods as the throughput of the FTN signal increases.
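
The idea of using the interference as branch information can be illustrated with a small hard-decision Viterbi equalizer. This is a simplified sketch under stated assumptions, not the paper's soft-output (BCJR) equalizer or its modified trellis: the two-tap channel `(1.0, 0.5)`, the BPSK alphabet, and the function names `apply_isi` and `viterbi_equalize` are all hypothetical.

```python
def apply_isi(symbols, taps=(1.0, 0.5)):
    """Pass BPSK symbols through an assumed two-tap ISI channel (no noise)."""
    h0, h1 = taps
    out, prev = [], 0  # nothing is transmitted before the first symbol
    for s in symbols:
        out.append(h0 * s + h1 * prev)
        prev = s
    return out


def viterbi_equalize(received, taps=(1.0, 0.5)):
    """Hard-decision Viterbi equalizer; the trellis state is the previous symbol."""
    h0, h1 = taps
    alphabet = (-1, 1)
    if not received:
        return []

    # First sample: no preceding symbol, so no interference term.
    metrics = [(received[0] - h0 * s) ** 2 for s in alphabet]
    backptrs = []

    for r in received[1:]:
        new_metrics, ptrs = [], []
        for s in alphabet:
            # The branch metric includes the known interference h1 * p
            # contributed by the hypothesised previous symbol p.
            cands = [metrics[j] + (r - h0 * s - h1 * p) ** 2
                     for j, p in enumerate(alphabet)]
            best = min(range(len(alphabet)), key=cands.__getitem__)
            new_metrics.append(cands[best])
            ptrs.append(best)
        metrics = new_metrics
        backptrs.append(ptrs)

    # Trace back from the best final state.
    state = min(range(len(alphabet)), key=metrics.__getitem__)
    decided = [alphabet[state]]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        decided.append(alphabet[state])
    decided.reverse()
    return decided


if __name__ == "__main__":
    tx = [1, -1, -1, 1, 1, -1]
    print(viterbi_equalize(apply_isi(tx)) == tx)
```

A turbo equalizer of the kind the abstract describes would replace the hard decisions with soft outputs and exchange them with the LDPC decoder; that loop is omitted here.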

Substream-based out-of-sequence packet scheduling for streaming stored media (저장매체 스트리밍에서 substream에 기초한 비순차 패킷 스케줄링)

  • Choi Su Jeong;Ahn Hee June;Kang Sang Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.10C
    • /
    • pp.1469-1483
    • /
    • 2004
  • We propose a packet scheduling algorithm for streaming media. We assume that the receiver periodically reports back the channel throughput. From the original video data, the importance level of a video packet is determined by its relative position within its group of pictures, taking into account the motion-texture discrimination and temporal scalability. Thus, we generate a number of nested substreams. Using feedback information from the receiver and statistical characteristics of the video, we model the streaming system as a queueing system, compute the run-time decoding failure probability of a frame in each substream based on an effective bandwidth approach, and determine the optimum substream to be sent at that moment in time. Since the optimum substream is updated periodically, the resulting sending order differs from the original playback order. From experiments with real video data, we show that our proposed scheduling scheme outperforms the conventional sequential sending scheme.

A Versatile Reed-Solomon Decoder for Continuous Decoding of Variable Block-Length Codewords (가변 블록 길이 부호어의 연속 복호를 위한 가변형 Reed-Solomon 복호기)

  • 송문규;공민한
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.41 no.3
    • /
    • pp.187-187
    • /
    • 2004
  • In this paper, we present an efficient architecture of a versatile Reed-Solomon (RS) decoder that can be programmed to decode RS codes continuously with any message length k as well as any block length n. This unique feature eliminates the need to insert zeros when decoding shortened RS codes. Also, the values of the parameters n and k, and hence the error-correcting capability t, can be altered at every codeword block. The decoder permits 3-step pipelined processing based on the modified Euclid's algorithm (MEA). Since each step can be driven by a separate clock, the decoder can operate as a 2-step pipeline by employing a faster clock in step 2 and/or step 3. The decoder can also be used when the input clock differs from the output clock. Each step is designed to have a structure suitable for decoding RS codes with varying block length. A new architecture for the MEA is designed for variable values of t. The operating length of the shift registers in the MEA block is shortened by one, and it can be varied according to the value of t. To maintain the throughput rate with less circuitry, the MEA block uses both the recursive technique and the over-clocking technique. The decoder can decode codewords received not only in a burst mode but also in a continuous mode. It can be used in a wide range of applications because of its versatility. The adaptive RS decoder over GF($2^8$) with an error-correcting capability of up to 10 has been designed in VHDL and successfully synthesized in an FPGA chip.

A Versatile Reed-Solomon Decoder for Continuous Decoding of Variable Block-Length Codewords (가변 블록 길이 부호어의 연속 복호를 위한 가변형 Reed-Solomon 복호기)

  • 송문규;공민한
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.41 no.3
    • /
    • pp.29-38
    • /
    • 2004
  • In this paper, we present an efficient architecture of a versatile Reed-Solomon (RS) decoder that can be programmed to decode RS codes continuously with any message length k as well as any block length n. This unique feature eliminates the need to insert zeros when decoding shortened RS codes. Also, the values of the parameters n and k, and hence the error-correcting capability t, can be altered at every codeword block. The decoder permits 3-step pipelined processing based on the modified Euclid's algorithm (MEA). Since each step can be driven by a separate clock, the decoder can operate as a 2-step pipeline by employing a faster clock in step 2 and/or step 3. The decoder can also be used when the input clock differs from the output clock. Each step is designed to have a structure suitable for decoding RS codes with varying block length. A new architecture for the MEA is designed for variable values of t. The operating length of the shift registers in the MEA block is shortened by one, and it can be varied according to the value of t. To maintain the throughput rate with less circuitry, the MEA block uses both the recursive technique and the over-clocking technique. The decoder can decode codewords received not only in a burst mode but also in a continuous mode. It can be used in a wide range of applications because of its versatility. The adaptive RS decoder over GF($2^8$) with an error-correcting capability of up to 10 has been designed in VHDL and successfully synthesized in an FPGA chip.
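
The relationship between n, k, and t mentioned in the abstract above follows the standard Reed-Solomon rule t = floor((n - k) / 2). The sketch below only checks parameters; it is not the decoder. The function name `rs_error_capability` and the limit `max_t=10` (matching the "up to 10" stated for the GF($2^8$) design) are illustrative assumptions.

```python
def rs_error_capability(n, k, m=8, max_t=10):
    """Return t = (n - k) // 2 for an RS(n, k) code over GF(2^m).

    Raises ValueError when the parameters are outside what a decoder
    limited to max_t correctable symbol errors could handle.
    """
    n_max = 2 ** m - 1  # longest (unshortened) block length over GF(2^m)
    if not (0 < k < n <= n_max):
        raise ValueError(f"need 0 < k < n <= {n_max}")
    t = (n - k) // 2
    if not (1 <= t <= max_t):
        raise ValueError(f"t = {t} is outside the supported range 1..{max_t}")
    return t


if __name__ == "__main__":
    print(rs_error_capability(255, 235))  # full-length code
    print(rs_error_capability(204, 188))  # shortened code
```

A shortened code such as RS(204, 188) keeps the same n - k as its parent code, so t is unchanged; a decoder that handles arbitrary n directly avoids padding the shortened codeword with zeros.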

Performance analysis of Downlink Multi-user MIMO based on TM9 in Rel.10 (Rel.10 의 TM9 기반 다운링크 Multi-user MIMO 성능분석)

  • Song, Hua Yue;Choi, Seung Won
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.10 no.1
    • /
    • pp.47-53
    • /
    • 2014
  • LTE-Advanced (LTE-A) is the evolved version of LTE, currently being developed at the 3GPP. As the number of smartphone users rapidly increases, demand for ever more capacity is driven largely by video usage and high-quality data communication, which has led more researchers worldwide to study LTE-A. LTE-A aims to improve service and communication quality over 3G systems in terms of throughput, peak data rate, latency, and spectral efficiency. Among the various features of LTE-A, multi-user MIMO (MU-MIMO), in which the base station transmits several streams to multiple receivers, is expected to improve system quality. In this paper, we investigate the performance of various types of downlink receivers with a fixed number of antennas. We first review the development from LTE to LTE-A. Second, we introduce TM9, which is adopted in Rel.10 for MU-MIMO systems, including the MU-MIMO system model and an explanation of the algorithm used in the system. We also briefly introduce sub-blocking in turbo decoding. Finally, we compare the performance of the uncoded case with that of the coded case, which uses turbo encoding.

An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units

  • Lee, Sang-Pil;Kim, Deok-Ho;Yi, Jae-Young;Ro, Won-Woo
    • Journal of Information Processing Systems
    • /
    • v.8 no.1
    • /
    • pp.159-174
    • /
    • 2012
  • This paper presents a study on a high-performance design for a block cipher algorithm implemented on modern many-core graphics processing units (GPUs). Recent advances in VLSI technology make it feasible to fabricate multiple processing cores on a single chip and enable general-purpose computation on a GPU (GPGPU). The GPU strategy offers significant performance improvements for general-purpose computation and can be used to support a broad variety of applications, including cryptography. We propose an efficient implementation of the encryption/decryption operations of a block cipher algorithm, SEED, on off-the-shelf NVIDIA many-core graphics processors. In a thorough experiment, we achieved performance capable of supporting a network speed of up to 9.5 Gbps on an NVIDIA GTX285 system (which has 240 processing cores). Our implementation provides up to 4.75 times higher encoding and decoding throughput than an Intel 8-core system.

Hardware Implementation of Context Modeler in HEVC CABAC Decoder (HEVC CABAC 복호기의 문맥 모델러 설계)

  • Kim, Sohyun;Kim, Doohwan;Lee, Seongsoo
    • Journal of IKEEE
    • /
    • v.21 no.3
    • /
    • pp.280-283
    • /
    • 2017
  • HEVC (high efficiency video coding) exploits CABAC (context-based adaptive binary arithmetic coding) for entropy coding, where a context model estimates the probability for each syntax element. In this paper, a context modeler was designed and implemented for CABAC decoding. A lookup table was used to reduce computation and increase speed. Twelve simulations covering HEVC standard test sequences and encoder configurations were performed, and the context modeler was verified to operate correctly. The designed context modeler was synthesized in 0.18 µm technology. Its maximum frequency, maximum throughput, and gate count are 200 MHz, 200 Mbin/s, and 29,268 gates, respectively.