Computationally Efficient Implementation of a Hamming Code Decoder Using Graphics Processing Unit

  • Islam, Md Shohidul (School of Electrical Engineering, University of Ulsan)
  • Kim, Cheol-Hong (School of Electronics and Computer Engineering, Chonnam National University)
  • Kim, Jong-Myon (School of Electrical Engineering, University of Ulsan)
  • Received: 2014.02.05
  • Accepted: 2014.12.01
  • Published: 2015.04.30

Abstract

This paper presents a computationally efficient implementation of a Hamming code decoder on a graphics processing unit (GPU) to support real-time software-defined radio, a software alternative for realizing wireless communication. The Hamming code algorithm is challenging to parallelize effectively on a GPU because it operates on sparsely located data items and contains several conditional statements, which lead to non-coalesced, long-latency global memory accesses and heavy thread divergence. To address these issues, we propose an optimized GPU implementation of the Hamming code decoder that exploits the parallelism inherent in the algorithm. Experimental results on a compute unified device architecture (CUDA)-enabled NVIDIA GeForce GTX 560 with 335 cores show that the proposed approach achieves a 99x speedup over the equivalent CPU-based implementation.
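For readers unfamiliar with the decoding step being accelerated, the following is a minimal sketch of syndrome-based decoding for the textbook Hamming(7,4) code. It is not the authors' GPU implementation: the code parameters, the bit layout (parity bits at positions 1, 2, and 4), and the function names `hamming74_encode`, `hamming74_decode`, and `decode_batch` are assumptions chosen for illustration. The point it shows is that each codeword is decoded independently of the others, which is the data parallelism a GPU version could map to one thread per codeword.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword.

    Assumed layout (positions 1..7): p1, p2, d1, p4, d2, d3, d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, syndrome).

    The syndrome is 0 for a clean codeword; otherwise it is the
    1-based position of the single bit assumed to be in error.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)
    if syndrome:
        # This conditional is the kind of branch that can cause
        # thread divergence when neighbouring threads disagree.
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


def decode_batch(codewords):
    """Decode codewords one by one; no codeword depends on another."""
    return [hamming74_decode(cw)[0] for cw in codewords]
```

A short usage example: encoding `[1, 0, 1, 1]`, flipping any single bit of the result, and passing it to `hamming74_decode` recovers the original four data bits, with the syndrome reporting which position was corrected.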

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)
