http://dx.doi.org/10.3837/tiis.2013.08.010

From WiFi to WiMAX: Efficient GPU-based Parameterized Transceiver across Different OFDM Protocols  

Li, Rongchun (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
Dou, Yong (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
Zhou, Jie (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
Li, Baofeng (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
Xu, Jinbo (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.7, no.8, 2013, pp. 1911-1932
Abstract
Orthogonal frequency-division multiplexing (OFDM) has become a popular modulation scheme for wireless protocols because of its spectral efficiency and robustness against multipath interference. Although the components of various OFDM protocols are functionally similar, they remain distinct because of the characteristics of their target environments. Recently, graphics processing units (GPUs) have been used to accelerate physical layer (PHY) signal processing because of their great computational power, high development efficiency, and flexibility. In this paper, we describe the implementation of parameterized baseband modules using GPUs for two different OFDM protocols, namely, 802.11a and 802.16. First, we introduce the modules in the modulator/demodulator parts of the transmitter and receiver and analyze the computational complexity of each module. We then describe the integration of the GPU-based baseband modules of the two protocols using the parameterized method. The GPU-based implementations are then presented to explain how the baseband processing is accelerated to achieve real-time throughput. Finally, the performance results of each signal processing module are evaluated and analyzed. The experiments show that the GPU-based 802.11a and 802.16 PHY meet the real-time requirement and demonstrate good bit error ratio (BER) performance. The performance comparison indicates that our GPU-based modules offer better flexibility and throughput than existing implementations.
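To make the idea of a parameterized OFDM module concrete, the following is a minimal CPU-side sketch in Python; it is not the GPU implementation described in the paper. It assumes the standard numerology of the two protocols (802.11a: 64-point FFT with 48 data and 4 pilot subcarriers and a 1/4 cyclic prefix; 802.16 OFDM PHY: 256-point FFT with 192 data and 8 pilot subcarriers, with a 1/8 cyclic prefix picked here from the ratios the standard allows). The names `OfdmParams`, `modulate`, and `demodulate` are hypothetical, a naive DFT stands in for an FFT kernel, and pilot insertion, channel coding, and interleaving are omitted.

```python
import cmath
from dataclasses import dataclass


@dataclass(frozen=True)
class OfdmParams:
    """Hypothetical parameter set that selects a protocol at run time."""
    name: str
    fft_size: int
    data_subcarriers: int
    pilot_subcarriers: int
    cp_divisor: int  # cyclic prefix length = fft_size // cp_divisor

    @property
    def used(self) -> int:
        return self.data_subcarriers + self.pilot_subcarriers

    @property
    def cp_len(self) -> int:
        return self.fft_size // self.cp_divisor


WIFI_80211A = OfdmParams("802.11a", 64, 48, 4, 4)
WIMAX_80216 = OfdmParams("802.16 OFDM", 256, 192, 8, 8)  # CP 1/8 is an assumption


def _used_bins(p: OfdmParams) -> list:
    # Used subcarriers sit symmetrically around DC; the DC bin stays empty.
    half = p.used // 2
    offsets = list(range(-half, 0)) + list(range(1, half + 1))
    return [k % p.fft_size for k in offsets]


def _dft(x: list, sign: int) -> list:
    # Naive O(N^2) transform, used only to keep the sketch dependency-free.
    n = len(x)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, v in enumerate(x))
        for k in range(n)
    ]


def modulate(symbols: list, p: OfdmParams) -> list:
    """Map one OFDM symbol onto subcarriers, transform, prepend the cyclic prefix."""
    if len(symbols) != p.used:
        raise ValueError("expected %d subcarrier values" % p.used)
    bins = [0j] * p.fft_size
    for value, k in zip(symbols, _used_bins(p)):
        bins[k] = value
    time = [v / p.fft_size for v in _dft(bins, +1)]  # inverse DFT
    return time[-p.cp_len:] + time


def demodulate(samples: list, p: OfdmParams) -> list:
    """Drop the cyclic prefix, transform back, and read the used subcarriers."""
    bins = _dft(samples[p.cp_len:], -1)  # forward DFT
    return [bins[k] for k in _used_bins(p)]
```

The same two functions serve both protocols; only the `OfdmParams` instance changes, which is the property a parameterized transceiver relies on. For example, `modulate(values, WIFI_80211A)` yields 80 samples per symbol, while `modulate(values, WIMAX_80216)` yields 288 under the assumed 1/8 prefix.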
Keywords
GPU; SDR; OFDM; WiFi; WiMAX;