Search | Korea Science

Large-scale 3D fast Fourier transform computation on a GPU

Jaehong Lee;Duksu Kim
- ETRI Journal / v.45 no.6 / pp.1035-1045 / 2023
We propose a novel graphics processing unit (GPU) algorithm that can handle a large-scale 3D fast Fourier transform (3D-FFT) problem whose data size exceeds the GPU's memory. To work within the limited device memory, the 3D-FFT is computed as a series of 1D FFTs. Moreover, to reduce the communication overhead between the CPU and GPU, we propose a 3D data-transposition method that converts the target 1D vector into a contiguous memory layout and improves data transfer efficiency. The transposed data are transferred between the host and device memories efficiently through a pinned buffer and multiple streams. We apply our method to various large-scale benchmarks and compare its performance with the state-of-the-art multicore CPU FFT library (the Fastest Fourier Transform in the West, FFTW) and a prior GPU-based 3D-FFT algorithm. Our method achieves higher performance than FFTW (up to 2.89 times), and the performance gap widens as the data size increases. The performance of the prior GPU algorithm decreases considerably on massive-scale problems, whereas our method's performance remains stable.
https://doi.org/10.4218/etrij.2022-0297
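
The abstract's "1D FFT-based 3D-FFT" idea rests on the fact that a 3D transform separates into 1D transforms along each axis. The sketch below illustrates only that separability, on the CPU, with a naive DFT standing in for a real FFT routine. The function names are hypothetical, and nothing here reproduces the paper's GPU transposition, pinned-buffer, or stream scheme.

```python
import cmath


def dft1d(x):
    """Naive O(n^2) 1D DFT, standing in for a real 1D FFT routine."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft3d_by_axes(vol):
    """3D DFT of vol[x][y][z] computed as three passes of 1D transforms."""
    nx, ny, nz = len(vol), len(vol[0]), len(vol[0][0])
    out = [[[complex(v) for v in row] for row in plane] for plane in vol]
    # Pass 1: z axis. Each row is already a contiguous 1D vector.
    for x in range(nx):
        for y in range(ny):
            out[x][y] = dft1d(out[x][y])
    # Pass 2: y axis. Gather the strided elements into a contiguous buffer.
    for x in range(nx):
        for z in range(nz):
            line = dft1d([out[x][y][z] for y in range(ny)])
            for y in range(ny):
                out[x][y][z] = line[y]
    # Pass 3: x axis, gathered the same way.
    for y in range(ny):
        for z in range(nz):
            line = dft1d([out[x][y][z] for x in range(nx)])
            for x in range(nx):
                out[x][y][z] = line[x]
    return out


def dft3d_direct(vol):
    """Direct triple-sum definition of the 3D DFT, used only as a check."""
    nx, ny, nz = len(vol), len(vol[0]), len(vol[0][0])
    out = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    for kx in range(nx):
        for ky in range(ny):
            for kz in range(nz):
                acc = 0j
                for x in range(nx):
                    for y in range(ny):
                        for z in range(nz):
                            phase = kx * x / nx + ky * y / ny + kz * z / nz
                            acc += vol[x][y][z] * cmath.exp(-2j * cmath.pi * phase)
                out[kx][ky][kz] = acc
    return out
```

The gather step in passes 2 and 3 is where a strided vector is copied into a contiguous buffer before the 1D transform. That is the same general concern the abstract raises about memory layout, shown here in its simplest form.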

FMCW RADAR SIGNAL PROCESS USING REAL FFT (Real FFT를 이용한 FMCW 레이더 신호처리)

Kim, Min-Joon;Cheon, I-Hwan;Kim, Ju-Hyun
- Journal of the Korea Institute of Information and Communication Engineering / v.11 no.12 / pp.2227-2232 / 2007
This paper presents a real FFT for high-resolution FMCW radar distance measurement. High distance resolution requires accurate measurement of the beat frequency. To improve the distance resolution, zoom FFT, decimation, a digital low-pass filter, and zero padding are used. Simulation results in MATLAB show a distance resolution of ${\pm}5$ mm and a measuring range of up to 35 m.
https://doi.org/10.6109/jkiice.2007.11.12.2227 KSCI
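
As a rough illustration of why the beat frequency matters, the sketch below converts a beat frequency to range with the standard linear-sweep relation R = c * f_b * T / (2 * B), and shows how zero padding refines the frequency grid on which the spectral peak is located. The sweep bandwidth, sweep time, sample rate, and function names are all assumed for illustration and are not taken from the paper. Zero padding interpolates the spectrum and does not by itself separate closely spaced targets.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def beat_to_range(f_beat, bandwidth, sweep_time):
    """Range for a linear sawtooth FMCW sweep: R = c * f_b * T / (2 * B)."""
    return C * f_beat * sweep_time / (2.0 * bandwidth)


def peak_frequency(samples, fs, pad_factor=1):
    """Frequency of the largest DFT magnitude after zero padding.

    The padded length is pad_factor * len(samples); the appended zeros add
    nothing to the sums, so only the original samples are accumulated.
    """
    n = len(samples)
    m = n * pad_factor
    best_k, best_mag = 0, -1.0
    for k in range(m // 2 + 1):  # real input: non-negative frequencies only
        acc = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / m) for t in range(n)
        )
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k * fs / m


# Hypothetical beat tone at 103 Hz sampled at 1 kHz for 64 samples.
fs, n, f_true = 1000.0, 64, 103.0
tone = [math.cos(2 * math.pi * f_true * t / fs) for t in range(n)]
coarse = peak_frequency(tone, fs)                 # bin spacing fs / 64
fine = peak_frequency(tone, fs, pad_factor=16)    # bin spacing fs / 1024
```

With these assumed numbers the unpadded grid spacing is about 15.6 Hz, while the padded grid spacing is under 1 Hz, so `fine` lands much closer to the true tone than `coarse`.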

Design of Current-to-Voltage Converter for the Current-mode FFT LSI in 0.35um processing (0.35um 공정에서 OFDM 용 전류모드 FFT LSI를 위한 I-V Converter 설계)

Bae, Seong-Ho;Hong, Sun-Yang;Jeon, Seong-Yong;Kim, Seong-Gwon
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.04a / pp.469-472 / 2007
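Recently, many broadband wired and wireless communication applications have adopted OFDM (Orthogonal Frequency Division Multiplexing) as a standard technology. FFT processors for high-speed OFDM wireless data communication have generally been implemented with DSP (Digital Signal Processing), which requires large power consumption. To compensate for this power problem, a drawback of OFDM, the current-mode FFT LSI was proposed. In this paper, we design a low-power IVC (current-to-voltage converter) for implementing the current-mode FFT LSI. The designed IVC outputs a voltage of 3 V or more when the FFT block output is $13.65{\mu}A$ or more, and a voltage of 0.5 V or less when the FFT block output is $0.15{\mu}A$ or less. The total power consumption of the IVC is about 1.65 mW. Designing a low-power IVC in a $0.35{\mu}m$ process makes it possible to design a current-mode FFT LSI in a $0.35{\mu}m$ process. The current-mode FFT LSI for low-power OFDM communication is expected to contribute to the advancement of wireless communication.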

Low-power Single-Chip Current-to-Voltage Converter for Wireless OFDM Terminal Modem (OFDM 용 무선통신단말기 모뎀의 저소비 전력화를 위한 단일칩용 I-V 컨버터)

Kim, Seong-Kweon
- Journal of the Korean Institute of Intelligent Systems / v.17 no.4 / pp.569-574 / 2007
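Recently, many broadband wired and wireless communication applications have adopted OFDM (Orthogonal Frequency Division Multiplexing) as a standard technology. FFT processors for high-speed OFDM wireless data communication have generally been implemented with DSP (Digital Signal Processing), which requires large power consumption. A current-mode FFT LSI was therefore proposed to mitigate this power problem of OFDM. Operating a low-power current-mode FFT LSI requires a VIC (voltage-to-current converter) to convert voltage-mode signals to current mode and an IVC (current-to-voltage converter) to convert the current-mode signals back to voltage mode. However, a conventional IVC built with an OP-AMP has a large circuit size, consumes much power, and requires a large, accurate, high-value resistor inside the LSI. In addition, when it receives a current signal from the output terminal of a circuit widely used in current-mode signal processing, such as a current mirror, a potential difference arises between the input terminals and a DC offset current occurs. In this study, we therefore design an IVC in a $0.35{\mu}m$ process that can operate at low power and is suitable for future single-chip applications, which enables voltage-mode output from a current-mode FFT LSI in a $0.35{\mu}m$ process. Considering that the output of the FFT LSI corresponds to ${\pm}1$ when converted to a digital signal, the designed IVC outputs 3.0 V when the current-mode FFT LSI output is $13.65{\mu}A$ or more and 0.5 V or less when the FFT LSI output is $0.15{\mu}A$ or less. The total power consumption of the IVC was evaluated to be about 1.65 mW or less.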
https://doi.org/10.5391/JKIIS.2007.17.4.569 KSCI

Improvement in computing times by the elimination of redundancies in existing DFT and FFT (DFT 및 FFT에 있어서의 Redundancies와 그의 제거에 의한 Fourier 변환고속화)

안수길
- Journal of the Korean Institute of Telematics and Electronics / v.14 no.6 / pp.26-30 / 1977
Redundancies in the calculation of the DFT and FFT are analyzed, and new algorithms are proposed that can reduce machine time by a considerable amount. New extensions of T.D.C.F. and T.D.F.T. are given for the discrete case, which permit deeper insight into the techniques of digital signal processing, i.e., the discrete Fourier transform, convolution sum, and correlation sequences.

Parameterized FFT/IFFT Core Generator for OFDM Modulation/Demodulation (OFDM 변복조를 위한 파라메터화된 FFT/IFFT 코어 생성기)

Lee, J.W.;Kim, J.H.;Shin, K.W.;Baek, Y.S.;Eo, I.S.
- Proceedings of the IEEK Conference / 2005.11a / pp.659-662 / 2005
A parameterized FFT/IFFT core generator (PFFT_CoreGen) is designed, which can be used as an essential IP (intellectual property) block in various OFDM modem designs. PFFT_CoreGen generates Verilog-HDL models of FFT cores ranging from 64 to 2048 points. To optimize the performance of the generated FFT cores, PFFT_CoreGen can select the word lengths of the input data, internal data, and twiddle factors in the range of 8 to 24 bits. Low-power design techniques are considered from the algorithm level to the circuit level.
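
To make the word-length trade-off concrete, the sketch below quantizes twiddle factors to a signed fixed-point grid and reports the worst-case error for a given FFT size and word length. The fixed-point format (one sign bit, the rest fractional, with saturation) and the function names are assumptions for illustration. The abstract does not describe PFFT_CoreGen's actual number format or its Verilog output.

```python
import cmath


def quantize(value, word_length):
    """Round to a signed fixed-point grid with word_length bits.

    Assumed format: 1 sign bit and word_length - 1 fractional bits,
    saturating at the largest representable value.
    """
    scale = 1 << (word_length - 1)
    q = round(value * scale)
    q = max(-scale, min(scale - 1, q))
    return q / scale


def twiddle_table(n_points, word_length):
    """Quantized twiddle factors W_N^k for k = 0 .. N/2 - 1."""
    table = []
    for k in range(n_points // 2):
        w = cmath.exp(-2j * cmath.pi * k / n_points)
        table.append(
            complex(quantize(w.real, word_length), quantize(w.imag, word_length))
        )
    return table


def max_twiddle_error(n_points, word_length):
    """Largest distance between a quantized twiddle and its exact value."""
    return max(
        abs(q - cmath.exp(-2j * cmath.pi * k / n_points))
        for k, q in enumerate(twiddle_table(n_points, word_length))
    )
```

Sweeping `word_length` over the 8 to 24 bit range named in the abstract, for sizes between 64 and 2048 points, shows how each extra bit roughly halves the worst-case twiddle error under this assumed format.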

Design and Implementation Systolic Array FFT Processor Based on Shared Memory (공유 메모리 기반 시스토릭 어레이 FFT 프로세서 설계 및 구현)

Jeong, Dongmin;Roh, Yunseok;Son, Hanna;Jung, Yongchul;Jung, Yunho
- Journal of IKEEE / v.24 no.3 / pp.797-802 / 2020
This paper presents the design and implementation results of an FFT processor that supports 4096-point operation with less memory by merging the several memories used in the radix-4 systolic array FFT processor into a single memory. Sharing memory reduces the area and also simplifies the data flow, because data I/O proceeds through one memory. The presented FFT processor was implemented and verified on an FPGA device. The implementation used 51,855 CLB LUTs, 29,712 CLB registers, 8 block RAM tiles, and 450 DSPs, and confirmed that the memory area could be reduced by 65% compared to the existing radix-4 systolic array structure.
https://doi.org/10.7471/ikeee.2020.24.3.797 KSCI
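
For readers unfamiliar with the radix-4 (base-4) decomposition this processor builds on, the following is a minimal software sketch of a recursive radix-4 decimation-in-time FFT. It is a textbook formulation with hypothetical function names, not the paper's systolic-array or shared-memory architecture. A 4096-point transform is 4^6 points, so it needs six radix-4 stages.

```python
import cmath

# Powers of W_4 = exp(-2j*pi/4) = -1j, indexed by exponent mod 4.
ROT = [1, -1j, -1, 1j]


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    quarter = n // 4
    # Transform the four interleaved subsequences x[r], x[r+4], x[r+8], ...
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(quarter):
        # Apply twiddle factors W_N^(r*k) to the four partial results.
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # Radix-4 butterfly: four outputs spaced N/4 apart.
        for q in range(4):
            out[k + q * quarter] = sum(t[r] * ROT[(r * q) % 4] for r in range(4))
    return out


def radix4_stages(n):
    """Number of radix-4 stages, i.e. log4(n), for n a power of 4."""
    stages = 0
    while n > 1:
        if n % 4 != 0:
            raise ValueError("n must be a power of 4")
        n //= 4
        stages += 1
    return stages


def dft_naive(x):
    """Direct O(n^2) DFT used only to check the FFT."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

The butterfly multiplies only by 1, -1j, -1, and 1j, which is why radix-4 hardware needs fewer general multipliers per output than a radix-2 design of the same size.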

Design of Radix-4 FFT Processor Using Twice Perfect Shuffle (이중 완전 Shuffle을 이용한 Radix-4 FFT 프로세서의 설계)

Hwang, Myoung-Ha;Hwang, Ho-Jung
- Journal of the Korean Institute of Telematics and Electronics / v.27 no.2 / pp.144-150 / 1990
This paper describes a radix-4 fast Fourier transform (FFT) processor designed with a new twice perfect shuffle, developed from the perfect shuffle used in the radix-2 FFT algorithm. The FFT processor consists of a butterfly arithmetic circuit; address generators for input, output, and coefficients; input and output registers; and a controller. It also requires an external ROM for coefficient storage and RAM for input and output. The butterfly circuit includes 12 bit-serial ($16{\times}8$) multipliers, adders, subtractors, and delay shift registers. Operating on a 25 MHz two-phase clock, the processor can compute a 256-point FFT in 6168 clocks, i.e., 247 μs, and provides flexibility by allowing the user to select any size among 4, 16, 64, and 256 points. Fabricated in a 2-μm double-metal CMOS process, it includes about 28,000 transistors and 55 pads in an area of $8.0{\times}8.2$ mm$^2$.
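
The timing figure in the abstract can be checked with simple arithmetic. The sketch below converts the clock count to microseconds and counts the radix-4 butterflies in a 256-point transform. The function names are hypothetical, and the abstract does not say how the 6168 clocks are divided among butterflies, data movement, and control.

```python
def latency_us(clock_cycles, clock_hz):
    """Elapsed time in microseconds for a given cycle count and clock rate."""
    return clock_cycles / clock_hz * 1e6


def radix4_butterfly_count(n_points):
    """Number of radix-4 butterflies in an n-point FFT: (n / 4) * log4(n)."""
    stages, m = 0, n_points
    while m > 1:
        if m % 4 != 0:
            raise ValueError("n_points must be a power of 4")
        m //= 4
        stages += 1
    return (n_points // 4) * stages
```

Here 6168 clocks at 25 MHz gives 246.72 μs, which the abstract rounds to 247 μs. A 256-point radix-4 FFT has 4 stages of 64 butterflies, 256 in total.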

Performance Analysis of OFDM with Improved Dual Adaptive Equalizer in Microwave Band Two-path Channel Environments (마이크로파 대역 Tow-path 채널 환경에서 개선된 Dual 적응 등화기를 적용한 OFDM 시스템의 성능 분석)

Kim, Jang-Sook
- Journal of the Korea Society of Computer and Information / v.14 no.7 / pp.57-64 / 2009
This paper analyzes an OFDM system that applies three types of equalizers in a two-path channel in the microwave band. The two-path channel was simulated with the Rummler channel model. The results show that, when the symbol-energy-to-noise ratio is below 18 dB, the 1-tap frequency-domain equalizer performs considerably better than the pre-FFT 11-tap equalizer. Conversely, when the symbol-energy-to-noise ratio exceeds 18 dB, the pre-FFT 11-tap equalizer performs better than the 1-tap frequency-domain equalizer. Finally, the OFDM system with the dual equalizer performs better than the systems using either the 1-tap frequency-domain equalizer or the pre-FFT 11-tap equalizer alone.
https://doi.org/10.9708/jksci.2009.14.7.057
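
The 1-tap frequency-domain equalizer mentioned in the abstract can be sketched in a few lines. The example below is a noise-free toy model under stated assumptions: a simple two-path channel (direct path plus one delayed echo) rather than the Rummler model, a cyclic prefix exactly as long as the echo delay, and perfect channel knowledge at the receiver. The function names are hypothetical, and neither the pre-FFT 11-tap equalizer nor the dual equalizer is modeled.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT or inverse DFT (with 1/n scaling on the inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def ofdm_one_tap_demo(symbols, echo_gain, echo_delay):
    """Send one OFDM symbol through a two-path channel and equalize it."""
    n = len(symbols)
    if not 1 <= echo_delay < n:
        raise ValueError("echo_delay must satisfy 1 <= echo_delay < len(symbols)")
    time = dft(symbols, inverse=True)        # OFDM modulator (IFFT)
    tx = time[-echo_delay:] + time           # prepend the cyclic prefix
    # Two-path channel: y[t] = x[t] + echo_gain * x[t - echo_delay].
    rx = [
        tx[t] + (echo_gain * tx[t - echo_delay] if t >= echo_delay else 0)
        for t in range(len(tx))
    ]
    rx = rx[echo_delay:]                     # drop the cyclic prefix
    freq = dft(rx)                           # OFDM demodulator (FFT)
    # Channel frequency response on each subcarrier.
    h = [
        1 + echo_gain * cmath.exp(-2j * cmath.pi * k * echo_delay / n)
        for k in range(n)
    ]
    # 1-tap equalizer: one complex division per subcarrier.
    return [freq[k] / h[k] for k in range(n)]
```

Because the cyclic prefix turns the channel into a circular convolution, each subcarrier sees a single complex gain, which is why one tap per subcarrier is enough in this idealized setting. With noise, subcarriers near a deep channel notch are amplified along with the noise, which is the regime the abstract's comparison addresses.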

An Experimental Study on the Fitting of 64 Channel Digital Hearing Aid by In-situ Method (64채널 디지털 보청기의 In-situ에 의한 휘팅 실험 연구)

Jarng, Soon-Suck
- The Journal of the Acoustical Society of Korea / v.31 no.5 / pp.273-279 / 2012
In this study, a nonlinear compression fitting method was investigated for each frequency channel of a 64-channel digital hearing aid. Unlike the conventional fitting-formula method, which works from the results of a hearing loss test, the present fitting method uses the auditory threshold of sound pressure measured near the tympanic membrane while an ITE (in-the-ear) hearing aid is fitted in the user's ear canal. In addition, the spectral distribution of voice sound pressure was used to derive the output sound pressure compression curves against input sound pressure level. The theoretical results of the FFT-iFFT compression algorithm were evaluated by experimental gain measurements at input sound pressure levels of 50 dB, 70 dB, and 90 dB.
https://doi.org/10.7776/ASK.2012.31.5.273 KSCI

Search Result 113, Processing Time 0.022 seconds