• Title/Summary/Keyword: memory size reduction

Search Result 97, Processing Time 0.025 seconds

Memory Reduction Method of DIT-based IFFT Bit-Reversal (DIT 기반 IFFT의 Bit-Reversal 메모리 감소 기법)

  • Kim, Jun-Ho;Piao, Zheyan;Cho, Kyung-Ju;Chung, Jin-Gyun
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.52 no.5
    • /
    • pp.66-73
    • /
    • 2015
  • IFFT is one of the key components in OFDM-based communication systems. In this paper, we propose a new memory efficient IFFT design method for OFDM-based communication systems, based on a mapping of three IFFT input signals which consist of modulated data, pilot and null signals. The proposed method focuses on reducing the memory size in the bit-reversal block which requires the largest number of memory cells in IFFT architectures. To reduce the memory size, we propose a selection mapping method based on decimation-in-time (DIT) algorithm. It is shown that the proposed method achieves a memory reduction of about 50% compared to conventional methods.

Efficient Signal Reordering Unit Implementation for FFT (FFT를 위한 효율적인 Signal Reordering Unit 구현)

  • Yang, Seung-Won;Lee, Jang-Yeol
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.58 no.6
    • /
    • pp.1241-1245
    • /
    • 2009
  • As FFT(Fast Fourier Transform) processor is used in OFDM(Orthogonal Frequency Division Multiplesing) system. According to increase requirement about mobility and broadband, Research about low power and low area FFT processor is needed. So research concern in reduction of memory size and complex multiplier is in progress. Increasing points of FFT increase memory area of FFT processor. Specially, SRU(Signal Reordering Unit) has the most memory in FFT processor. In this paper, we propose a reduced method of memory size of SRU in FFT processor. SRU of 64, 1024 point FFT processor performed implementation by VerilogHDL coding and it verified by simulation. We select the APEX20KE family EP20k1000EPC672-3 device of Altera Corps. SRU implementation is performed by synthesis of Quartus Tool. The bits of data size decide by 24bits that is 12bits from real, imaginary number respectively. It is shown that, the proposed SRU of 64point and 1024point achieve more than 28%, 24% area reduction respectively.

Memory Reduction Method of Radix-22 MDF IFFT for OFDM Communication Systems (OFDM 통신시스템을 위한 radix-22 MDF IFFT의 메모리 감소 기법)

  • Cho, Kyung-Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.13 no.1
    • /
    • pp.42-47
    • /
    • 2020
  • In OFDM-based very high-speed communication systems, FFT/IFFT processor should have several properties of low-area and low-power consumption as well as high throughput and low processing latency. Thus, radix-2k MDF (multipath delay feedback) architectures by adopting pipeline and parallel processing are suitable. In MDF architecture, the feedback memory which increases in proportion to the input signal word-length has a large area and power consumption. This paper presents a feedback memory size reduction method of radix-22 MDF IFFT processor for OFDM applications. The proposed method focuses on reducing the feedback memory size in the first two stages of MDF architectures since the first two stages occupy about 75% of the total feedback memory. In OFDM transmissions, IFFT input signals are composed of modulated data and pilot, null signals. In order to reduce the IFFT input word-length, the integer mapping which generates mapped data composed of two signed integer corresponding to modulated data and pilot/null signals is proposed. By simulation, it is shown that the proposed method has achieved a feedback memory reduction up to 39% compared to conventional approach.

A novel hardware design for SIFT generation with reduced memory requirement

  • Kim, Eung Sup;Lee, Hyuk-Jae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.13 no.2
    • /
    • pp.157-169
    • /
    • 2013
  • Scale Invariant Feature Transform (SIFT) generates image features widely used to match objects in different images. Previous work on hardware-based SIFT implementation requires excessive internal memory and hardware logic [1]. In this paper, a new hardware organization is proposed to implement SIFT with less memory and hardware cost than the previous work. To this end, a parallel Gaussian filter bank is adopted to eliminate the buffers that store intermediate results because parallel operations allow all intermediate results available at the same time. Furthermore, the processing order is changed from the raster-scan order to the block-by-block order so that the line buffer size storing the source image is also reduced. These techniques trade the reduction of memory size with a slight increase of the execution time and external memory bandwidth. As a result, the memory size is reduced by 94.4%. The proposed hardware for SIFT implementation includes the Descriptor generation block, which is omitted in the previous work [1]. The addition of the hardwired descriptor generation improves the computation speed by about 30 times when compared with the previous work.

A New Embedded Compression Algorithm for Memory Size and Bandwidth Reduction in Wavelet Transform Appliable to JPEG2000 (JPEG2000의 웨이블릿 변환용 메모리 크기 및 대역폭 감소를 위한 새로운 Embedded Compression 알고리즘)

  • Son, Chang-Hoon;Song, Sung-Gun;Kim, Ji-Won;Park, Seong-Mo;Kim, Young-Min
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.1
    • /
    • pp.94-102
    • /
    • 2011
  • To alleviate the size and bandwidth requirement in JPEG2000 system, a new Embedded Compression(EC) algorithm with minor image quality drop is proposed. For both random accessibility and low latency, very simple and efficient hadamard transform based compression algorithm is devised. We reduced LL intermediate memory and code-block memory to about half size and achieved significant memory bandwidth reductions(about 52~73%) through proposed multi-mode algorithms, without requiring any modification in JPEG2000 standard algorithm.

A File System for Large-scale NAND Flash Memory Based Storage System

  • Son, Sunghoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.9
    • /
    • pp.1-8
    • /
    • 2017
  • In this paper, we propose a file system for flash memory which remedies shortcomings of existing flash memory file systems. Besides supporting large block size, the proposed file system reduces time in initializing file system significantly by adopting logical address comprised of erase block number and bitmap for pages in the block to find a page. The file system is suitable for embedded systems with limited main memory since it has small in-memory data structures. It also provides efficient management of obsolete blocks and free blocks, which contribute to the reduction of file update time. Finally the proposed file system can easily configure the maximum file size and file system size limits, which results in portability to emerging larger flash memories. By conducting performance evaluation studies, we show that the proposed file system can contribute to the performance improvement of embedded systems.

A Overdrive Technique Architecture for the Frame Memory Reduction based on DWT and Color Conversion (Frame Memory 축소를 위한 DWT와 Color Conversion 기반의 Overdrive 구조)

  • Byeon, Jin-Su;Kim, Hyeon-Seop;Kim, Do-Seok;Jeon, Eun-Seon;Hong, In-Seong;Kim, Bo-Gwan
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.1
    • /
    • pp.85-91
    • /
    • 2009
  • Recently, the LCD has high market share in TV market. The use of motion images in portable devices like DMB, PMP and Cell Phone is growing rapidly. One of the technique of enhancing the LCD's characteristic which is the slow response time. But, the technique requires a lot of memory usage, because of the requirement of frame memory. In this paper, we propose a reduction method for the frame memory that is required for LCD overdrive. Proposed overdrive architecture based on modified DWT-Inverse DWT and Color Conversion. The proposed architecture has a considerable PSNR. At once, it uses 50% of frame memory size and reduces 15% of frame memory size compare with previous architecture. The design was implemented using Xilinx Vertex4 and had 2172 Slice except Memory.

Memory-Efficient Belief Propagation for Stereo Matching on GPU (GPU 에서의 고속 스테레오 정합을 위한 메모리 효율적인 Belief Propagation)

  • Choi, Young-Kyu;Williem, Williem;Park, In Kyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2012.11a
    • /
    • pp.52-53
    • /
    • 2012
  • Belief propagation (BP) is a commonly used global energy minimization algorithm for solving stereo matching problem in 3D reconstruction. However, it requires large memory bandwidth and data size. In this paper, we propose a novel memory-efficient algorithm of BP in stereo matching on the Graphics Processing Units (GPU). The data size and transfer bandwidth are significantly reduced by storing only a part of the whole message. In order to maintain the accuracy of the matching result, the local messages are reconstructed using shared memory available in GPU. Experimental result shows that there is almost an order of reduction in the global memory consumption, and 21 to 46% saving in memory bandwidth when compared to the conventional algorithm. The implementation result on a recent GPU shows that we can obtain 22.8 times speedup in execution time compared to the execution on CPU.

  • PDF

Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP (TMS320C6201 DSP를 이용한 HMM 기반의 음성인식기 구현)

  • Jung, Sung-Yun;Son, Jong-Mok;Bae, Keun-Sung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.1E
    • /
    • pp.20-24
    • /
    • 2006
  • In this paper, we focused on the real time implementation of a speech recognition system with medium size of vocabulary considering its application to a mobile phone. First, we developed the PC based variable vocabulary word recognizer having the size of program memory and total acoustic models as small as possible. To reduce the memory size of acoustic models, linear discriminant analysis and phonetic tied mixture were applied in the feature selection process and training HMMs, respectively. In addition, state based Gaussian selection method with the real time cepstral normalization was used for reduction of computational load and robust recognition. Then, we verified the real-time operation of the implemented recognition system on the TMS320C6201 EVM board. The implemented recognition system uses memory size of about 610 kbytes including both program memory and data memory. The recognition rate was 95.86% for ETRI 445DB, and 96.4%, 97.92%, 87.04% for three kinds of name databases collected through the mobile phones.

Code Size Reduction Through Efficient use of Multiple Load/store Instructions (복수의 메모리 접근 명령어의 효율적인 이용을 통한 코드 크기의 감소)

  • Ahn Minwook;Cho Doosan;Paek Yunheung;Cho Jeonghun
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.8
    • /
    • pp.819-833
    • /
    • 2005
  • Code size reduction is ever becoming more important for compilers targeting embedded processors because these processors are often severely limited by storage constraints and thus the reduced code size can have a positively significant Impact on their performance. Various code size reduction techniques have different motivations and a variety of application contexts utilizing special hardware features of their target processors. In this work, we propose a novel technique that fully utilizes a set of hardware instructions, called the multiple load/store (MLS), that are specially featured for reducing code size by minimizing the number of memory operations in the code. To take advantage of this feature, many microprocessors support the MLS instructions, whereas no existing compilers fully exploit the potential benefit of these instructions but only use them for some limited cases. This is mainly because optimizing memory accesses with MLS instructions for general cases is an NP-hard problem that necessitates complex assignments of registers and memory off-sets for variables in a stack frame. Our technique uses a couple of heuristics to efficiently handle this problem in a polynomial time bound.