• Title/Summary/Keyword: Decoding latency

Syndrome Check aided Fast-SSCANL Decoding Algorithm for Polar Codes

  • Choangyang Liu;Wenjie Dai;Rui Guo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.5
    • /
    • pp.1412-1430
    • /
    • 2024
  • The soft cancellation list (SCANL) decoding algorithm for polar codes runs L soft cancellation (SCAN) decoders with different decoding factor graphs. Although it achieves better decoding performance than the SCAN algorithm, it has high latency. In this paper, a fast simplified SCANL (Fast-SSCANL) algorithm that runs L independent Fast-SSCAN decoders is proposed. In the Fast-SSCANL decoder, special nodes in each factor graph are identified, and a corresponding low-latency decoding approach for each special node is proposed first. Then, a syndrome check aided Fast-SSCANL (SC-Fast-SSCANL) algorithm is put forward. Ordinary nodes that satisfy the syndrome check make hard decisions directly without traversing the factor graph, thereby reducing the decoding latency further. Simulation results show that the Fast-SSCANL and SC-Fast-SSCANL algorithms achieve the same BER performance as the SCANL algorithm with lower latency. Compared with SCANL, the Fast-SSCANL algorithm reduces latency by more than 83%, and the SC-Fast-SSCANL algorithm reduces latency by more than 85%, regardless of code length and code rate.

Low-latency SAO Architecture and its SIMD Optimization for HEVC Decoder

  • Kim, Yong-Hwan;Kim, Dong-Hyeok;Yi, Joo-Young;Kim, Je-Woo
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.3 no.1
    • /
    • pp.1-9
    • /
    • 2014
  • This paper proposes a low-latency Sample Adaptive Offset filter (SAO) architecture and its Single Instruction Multiple Data (SIMD) optimization scheme to achieve fast High Efficiency Video Coding (HEVC) decoding in a multi-core environment. According to the HEVC standard and its Test Model (HM), the SAO operation is performed only at the picture level. Most real-time decoders, however, execute their sub-modules on a Coding Tree Unit (CTU) basis to reduce latency and memory bandwidth. The proposed low-latency SAO architecture has the following advantages over picture-based SAO: 1) significantly lower memory requirements, and 2) a low-latency property that enables efficient pipelined multi-core decoding. In addition, SIMD optimization of SAO filtering can reduce the SAO filtering time significantly. The simulation results showed that the proposed low-latency SAO architecture, with significantly less memory usage, produces a decoding time similar to that of picture-based SAO in single-core decoding. Furthermore, the SIMD optimization scheme reduces the SAO filtering time by approximately 509% and increases the total decoding speed by approximately 7% compared to the existing look-up table approach of HM.

Design and Architecture of Low-Latency High-Speed Turbo Decoders

  • Jung, Ji-Won;Lee, In-Ki;Choi, Duk-Gun;Jeong, Jin-Hee;Kim, Ki-Man;Choi, Eun-A;Oh, Deock-Gil
    • ETRI Journal
    • /
    • v.27 no.5
    • /
    • pp.525-532
    • /
    • 2005
  • In this paper, we propose and present implementation results of a high-speed turbo decoding algorithm. The latency caused by (de)interleaving and iterative decoding in a conventional maximum a posteriori turbo decoder can be dramatically reduced with the proposed design. The latency reduction comes from the combination of the radix-4, center-to-top, parallel decoding, and early-stop algorithms. This reduced latency enables the use of the turbo decoder as a forward error correction scheme in real-time wireless communication services. The proposed scheme results in a slight degradation in bit error rate performance for large block sizes because the effective interleaver size in a radix-4 implementation is reduced to half, relative to the conventional method. To demonstrate the latency reduction, we implemented the proposed scheme on a field-programmable gate array and compared its decoding speed with that of a conventional decoder. The results show an improvement of at least fivefold for a single iteration of turbo decoding.

High performance Viterbi decoder using Modified Register Exchange methods (Modified Register Exchange 방식을 이용한 고성능 비터비 디코더 설계)

  • 한재선;이찬호
    • Proceedings of the IEEK Conference
    • /
    • 2003.07b
    • /
    • pp.803-806
    • /
    • 2003
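  • This paper proposes a Viterbi decoding scheme that applies the Modified Register Exchange method, which allows decoding without a traceback operation, to block decoding. Applying the Modified Register Exchange method to block decoding reduces the number of operation cycles needed to determine the decision bits, giving lower latency than conventional Viterbi decoders that use block decoding. In addition, memory can be used more efficiently, and the complexity of the hardware implementation is further reduced. For the same hardware complexity, the proposed scheme allows a design that emphasizes either memory reduction or latency reduction.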

Generalized SCAN Bit-Flipping Decoding Algorithm for Polar Code

  • Lou Chen;Guo Rui
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.4
    • /
    • pp.1296-1309
    • /
    • 2023
  • In this paper, based on the soft cancellation (SCAN) bit-flipping (SCAN-BF) algorithm, a generalized SCAN bit-flipping (GSCAN-BF-Ω) decoding algorithm is proposed, where Ω represents the number of bits flipped or corrected at the same time. The GSCAN-BF-Ω algorithm corrects the prior information of the code bits and flips the prior information of the unreliable information bits simultaneously to improve the block error rate (BLER) performance. Then, a joint threshold scheme for the GSCAN-BF-2 decoding algorithm is proposed to reduce the average decoding complexity by considering both the bit channel quality and the reliability of the coded bits. Simulation results show that the GSCAN-BF-Ω decoding algorithm reduces the average decoding latency while achieving performance gains compared to the common multiple SCAN bit-flipping decoding algorithm. Moreover, the GSCAN-BF-2 decoding algorithm with the joint threshold reduces the average decoding latency further by approximately 50% with only a slight performance loss compared to the GSCAN-BF-2 decoding algorithm.

Comparison on Recent Decoding Methods for Polar Codes based on Successive-Cancellation Decoding (연속 제거 복호기반의 최신 극 부호 복호기법 비교)

  • Choi, Soyeon;Yoo, Hoyoung
    • Journal of IKEEE
    • /
    • v.24 no.2
    • /
    • pp.550-558
    • /
    • 2020
  • Successive cancellation (SC) decoding, one of the decoding algorithms for polar codes, has long decoding latency and low throughput because of its successive nature. To reduce the latency and increase the throughput, various decoding structures for polar codes have been presented. In this paper, we compare the previous decoding structures and analyze them by dividing them into two types: pruning and multi-path decoders. Representative pruning decoders are the SSC (simplified SC), Fast-SSC and redundant-LLR structures, and representative multi-path decoders are the 2-bit SC and redundant-LLR structures. All the previous structures are compared in terms of decoding latency and hardware area. According to the comparison, the syndrome check based decoder has the lowest latency and the redundant-LLR decoder has the highest hardware efficiency.

Recent Successive Cancellation Decoding Methods for Polar Codes

  • Choi, Soyeon;Lee, Youngjoo;Yoo, Hoyoung
    • Journal of Semiconductor Engineering
    • /
    • v.1 no.2
    • /
    • pp.74-80
    • /
    • 2020
  • Due to its superior error correcting performance with affordable hardware complexity, the polar code has become one of the most important error correction codes (ECCs) and is now being intensively examined for applicability in various fields. However, Successive Cancellation (SC) decoding, which underlies the more advanced Successive Cancellation List (SCL) decoding, suffers from long latency due to its serial processing, which limits practical implementation. To mitigate this problem, many decoding architectures, mainly divided into pruning and parallel decoding, have been presented in previous work. In this paper, we compare the recent SC decoding architectures and analyze them using a tree structure.

An FPGA Design of High-Speed Turbo Decoder

  • Jung Ji-Won;Jung Jin-Hee;Choi Duk-Gun;Lee In-Ki
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.6C
    • /
    • pp.450-456
    • /
    • 2005
  • In this paper, we propose a high-speed turbo decoding algorithm and present the results of its implementation. The latency caused by (de)interleaving and iterative decoding in a conventional MAP turbo decoder can be dramatically reduced with the proposed scheme. The time reduction comes mainly from the use of radix-4, center-to-top, and parallel decoding algorithms. The reduced latency makes it possible to use a turbo decoder as an FEC scheme in real-time wireless communication services. However, the proposed scheme incurs a slight degradation in BER performance because the effective interleaver size in radix-4 is reduced to half of that in the conventional method. To verify the time reduction, we implemented the proposed scheme on an FPGA chip and compared it with a conventional one in terms of decoding speed. The decoding speed of the proposed scheme is at least 5 times faster than that of the conventional one for a single iteration of turbo decoding.

A 18-Mbp/s, 8-State, High-Speed Turbo Decoder

  • Jung Ji-Won;Kim Min-Hyuk;Jeong Jin-Hee
    • Journal of electromagnetic engineering and science
    • /
    • v.6 no.3
    • /
    • pp.147-154
    • /
    • 2006
  • In this paper, we propose and present implementation results of a high-speed turbo decoding algorithm. The latency caused by (de)interleaving and iterative decoding in a conventional maximum a posteriori (MAP) turbo decoder can be dramatically reduced with the proposed design. The latency reduction comes from the combination of the radix-4, dual-path processing, parallel decoding, and early-stop algorithms. This reduced latency enables the use of the turbo decoder as a forward error correction scheme in real-time wireless communication services. The proposed scheme results in a slight degradation in bit-error rate (BER) performance for large block sizes because the effective interleaver size in a radix-4 implementation is reduced to half, relative to the conventional method. With the parameters fixed at N=212, 8 states, 3 iterations, and the QPSK modulation scheme, we designed the adaptive high-speed turbo decoder using a Xilinx chip (VIRTEX2P (XC2VP30-5FG676)) with a speed of 17.78 Mb/s. From the results, we confirmed that the decoding speed of the proposed decoder is 8 times faster than that of conventional algorithms.

Low-Latency Polar Decoding for Error-Free and Single-Error Cases (단일 비트 이하 오류 정정을 위한 극 부호용 선 처리 복호기법)

  • Choi, Soyeon;Yoo, Hoyoung
    • Journal of IKEEE
    • /
    • v.22 no.4
    • /
    • pp.1168-1174
    • /
    • 2018
  • In the initial state of NAND flash memories, error-free and single-error cases are dominant due to the good channel environment of the memory cells. It is important to handle such cases efficiently because they affect the overall system performance. However, conventional schemes for polar codes decode all codewords in the same way, even in the error-free and single-error cases, since they cannot classify these cases and decode them separately. In this paper, a new pre-processing scheme for polar codes is proposed to improve the overall decoding latency: before the ordinary decoding process, the proposed scheme first decodes the frequent error-free and single-error cases. According to the experimental results, the proposed pre-processing scheme decreases the average decoding latency by 64% compared to the conventional scheme for (1024, 512) polar codes.