• Title/Summary/Keyword: Memory reduction

Data Prefetching Effect of the Stride Merging-Arrays Method (스트라이드 배열 병합 방법의 데이터 선인출 효과)

  • Jeong, In-Beom;Lee, Jun-Won
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.26 no.11
    • /
    • pp.1429-1436
    • /
    • 1999
  • Cache lines are composed of multiple words to obtain a data prefetching effect. However, if the prefetched data are not used, cache space is wasted and the cache miss rate increases. The merging-arrays method is used to reduce cache conflict misses, one of the causes of cache misses. However, the existing merging-arrays method prefetches data that are not useful into the cache line. This paper proposes a stride merging-arrays method that improves on this behavior. Simulation results show that when a cache line is composed of multiple words, the stride merging-arrays method improves cache performance not only by reducing cache conflict misses but also by increasing useful data prefetching. This improved cache performance also yields more scalable performance for parallel applications as the number of processors increases.
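As a rough illustration of why merging arrays reduces conflict misses, the following sketch simulates a small direct-mapped cache and compares two layouts: two separate arrays whose elements collide in the same cache line, and the same data interleaved into one merged array. The cache geometry, array sizes, and the function name `count_misses` are assumptions chosen for the example. This shows only conventional array merging; the stride variant proposed in the paper is not described in the abstract and is not modeled here.

```python
def count_misses(addresses, num_lines=4, words_per_line=4):
    """Count misses in a direct-mapped cache for a sequence of word addresses."""
    tags = [None] * num_lines          # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // words_per_line # memory block containing this word
        index = block % num_lines      # cache line the block maps to
        if tags[index] != block:
            misses += 1
            tags[index] = block        # fetch the whole block (implicit prefetch)
    return misses


N = 16             # elements per array (assumed)
CACHE_WORDS = 16   # cache capacity in words: 4 lines x 4 words

# Separate arrays: a starts at word 0, b starts exactly one cache size later,
# so a[i] and b[i] always map to the same cache line and evict each other.
separate = []
for i in range(N):
    separate += [i, CACHE_WORDS + i]

# Merged array: a[i] and b[i] are stored side by side in one block.
merged = []
for i in range(N):
    merged += [2 * i, 2 * i + 1]

print("separate arrays:", count_misses(separate), "misses")
print("merged array:   ", count_misses(merged), "misses")
```

In this access pattern both arrays are read together, so every word fetched with a block is used. If a loop touched only one of the two arrays, half of each merged block would be fetched without being used, which is the kind of useless prefetching the abstract attributes to the existing method.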

Design of Memory-Efficient Deterministic Finite Automata by Merging States With The Same Input Character (동일한 입력 문자를 가지는 상태의 병합을 통한 메모리 효율적인 결정적 유한 오토마타 구현)

  • Choi, Yoon-Ho
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.23 no.3
    • /
    • pp.395-404
    • /
    • 2013
  • Pattern matching algorithms play an important role in identifying and classifying traffic against predefined patterns for intrusion detection and prevention. As attacks become more prevalent and complex, patterns are now written as regular expressions (regexes), which are converted into deterministic finite automata (DFA) because DFAs guarantee worst-case performance during pattern matching. Because regex patterns have grown in both complexity and number, memory-efficient DFAs obtained through state reduction have become the mainstay of pattern matching. However, most previous work has focused on reducing the number of states within a single automaton, so the state blowup problem persists when the number of patterns is large. To solve this problem, we propose a new state compression algorithm that merges states across multiple automata. We show that by merging states with the same input character across multiple automata, the proposed algorithm reduces the number of states in the original DFA by 40.0% on average.

Delayed Dual Buffering: Reducing Page Fault Latency in Demand Paging for OneNAND Flash Memory (지연 이중 버퍼링: OneNAND 플래시를 이용한 페이지 반입 비용 절감 기법)

  • Joo, Yong-Soo;Park, Jae-Hyun;Chung, Sung-Woo;Chung, Eui-Young;Chang, Nae-Hyuck
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.3 s.357
    • /
    • pp.43-51
    • /
    • 2007
  • OneNAND flash combines the advantages of NAND and NOR flash and has become an alternative to the former. However, its advanced features are not used effectively in demand paging systems designed for NAND flash. We propose delayed dual buffering, a demand paging system that fully exploits the random-access I/O interface and dual page buffers of OneNAND flash. It effectively reduces the time taken to transfer a page from the OneNAND page buffer to main memory. On average, it achieves a 28.5% reduction in execution time and a 4.4% reduction in paging system energy consumption.

Vibration control of small horizontal axis wind turbine blade with shape memory alloy

  • Mouleeswaran, Senthil Kumar;Mani, Yuvaraja;Keerthivasan, P.;Veeraragu, Jagadeesh
    • Smart Structures and Systems
    • /
    • v.21 no.3
    • /
    • pp.257-262
    • /
    • 2018
  • Vibration problems in domestic small horizontal axis wind turbines (SHAWTs) are caused by flapwise vibrations arising from varying wind velocities acting perpendicular to the blade surface. Monitoring the structural health of turbine blades has been reported to require special attention, because the blades are key elements of wind power generation and account for 15-20% of the total turbine cost. If this vibration problem is addressed, the SHAWT can become a commercial success. In this work, shape memory alloy (SMA) wires made of Nitinol (Ni-Ti) are embedded into a glass fibre reinforced polymer (GFRP) wind turbine blade to reduce flapwise vibrations. The characteristics of the Nitinol wire were studied experimentally, and the relationships between parameters such as current, displacement, time, and temperature were established. When wind turbine blades are subjected to varying wind velocity, flapwise vibration occurs and must be controlled continuously; otherwise the blade will be damaged by resonance. To control these vibrations actively, a non-linear current controller unit was developed and fabricated; it provides the actuation force required for active vibration control in the smart blade. Experimental analysis of the conventional GFRP blade and the smart blade showed a 20% increase in natural frequency and a 20% reduction in vibration amplitude. With the addition of the active vibration control unit, the smart blade showed a 61% reduction in vibration amplitude.

TinyECCK : Efficient Implementation of Elliptic Curve Cryptosystem over GF$(2^m)$ on 8-bit Micaz Mote (TinyECCK : 8 비트 Micaz 모트에서 GF$(2^m)$상의 효율적인 타원곡선 암호 시스템 구현)

  • Seo, Seog-Chung;Han, Dong-Guk;Hong, Seok-Hie
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.18 no.3
    • /
    • pp.9-21
    • /
    • 2008
  • In this paper, we revisit a generally accepted opinion: that implementing an Elliptic Curve Cryptosystem (ECC) over GF$(2^m)$ on sensor motes with a small word size is not appropriate because partial XOR multiplication over GF$(2^m)$ is not efficiently supported by current low-power microprocessors. Although some implementations over GF$(2^m)$ on sensor motes exist, their performance is unsatisfactory because redundant memory accesses make field multiplication and reduction inefficient. We therefore propose techniques for reducing unnecessary memory access instructions. With the proposed strategies, the running times of field multiplication and reduction over GF$(2^{163})$ decrease by 21.1% and 24.7%, respectively. These savings noticeably reduce the execution time of Elliptic Curve Digital Signature Algorithm (ECDSA) operations (signing and verification) by around 15-19%.
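To make the two operations named in the abstract concrete, here is a minimal bit-serial sketch of field multiplication followed by reduction in GF$(2^{163})$. It assumes the NIST pentanomial $x^{163} + x^7 + x^6 + x^3 + 1$ as the reduction polynomial, which the abstract does not state, and the name `gf2m_mul` is hypothetical. It is not the word-level, memory-access-optimized method of the paper; it only shows what "XOR multiplication" and "reduction" compute.

```python
M = 163
# Assumed reduction polynomial: x^163 + x^7 + x^6 + x^3 + 1 (NIST curves over GF(2^163)).
REDUCTION_POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_mul(a, b):
    """Multiply two GF(2^163) elements held as integers (bit i = coefficient of x^i)."""
    # Step 1: carry-less (XOR) polynomial multiplication by shift-and-add.
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1

    # Step 2: reduction. Clear every bit at position >= M, highest first, by
    # XORing in the reduction polynomial aligned to that bit.
    for bit in range(product.bit_length() - 1, M - 1, -1):
        if (product >> bit) & 1:
            product ^= REDUCTION_POLY << (bit - M)
    return product


# x^162 * x = x^163, which reduces to x^7 + x^6 + x^3 + 1.
print(hex(gf2m_mul(1 << 162, 2)))
```

On an 8-bit mote the operands span 21 bytes, so each shift and XOR above becomes a sequence of loads and stores; that per-word memory traffic is what the paper reports reducing.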

A Novel CFR Algorithm using Histogram-based Code Domain Compensation Process for WCDMA Basestation (히스토그램 기반 코드 영역 보상 기법을 적용한 W-CDMA 기지국용 CFR 알고리즘)

  • Chang, Hyung-Min;Lee, Won-Cheol
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.12C
    • /
    • pp.1175-1187
    • /
    • 2007
  • This paper proposes a novel crest factor reduction (CFR) algorithm for deployment in WCDMA base stations. It is well known that reducing the peak-to-average ratio (PAR) makes it possible to use a low-cost power amplifier, making the base station more economical. However, simply reducing the PAR can degrade the signal quality, as measured by peak code domain error (PCDE) or error vector magnitude (EVM), and worsen the level of channel interference, which is constrained by the adjacent channel leakage ratio (ACLR). To address these imperfections, this paper introduces a CFR algorithm in which filter-dependent CFR (FDCFR) operates together with histogram-based waterfilling code domain compensation (HBWCDC). To verify the performance of the proposed CFR technique, extensive simulations, including comparative studies, are conducted in accordance with the W-CDMA base station verification specification. The performance of the proposed method is compared with that of the simple memoryless clipping method and the filter-dependent clipping method, which requires memory.

Design Challenges and Solutions for Ultra-High-Density Monolithic 3D ICs

  • Panth, Shreepad;Samal, Sandeep;Yu, Yun Seop;Lim, Sung Kyu
    • Journal of information and communication convergence engineering
    • /
    • v.12 no.3
    • /
    • pp.186-192
    • /
    • 2014
  • Monolithic three-dimensional integrated chips (3D ICs) are an emerging technology that offers an integration density that is some orders of magnitude higher than the conventional through-silicon-via (TSV)-based 3D ICs. This is due to a sequential integration process that enables extremely small monolithic inter-tier vias (MIVs). For a monolithic 3D memory, we first explore the static random-access memory (SRAM) design. Next, for digital logic, we explore several design styles. The first is transistor-level, which is a design style unique to monolithic 3D ICs that are enabled by the ultra-high-density of MIVs. We also explore gate-level and block-level design styles, which are available for TSV-based 3D ICs. For each of these design styles, we present techniques to obtain the graphic database system (GDS) layouts, and perform a signoff-quality performance and power analysis. We also discuss various challenges facing monolithic 3D ICs, such as achieving 50% footprint reduction over two-dimensional (2D) ICs, routing congestion, power delivery network design, and thermal issues. Finally, we present design techniques to overcome these challenges.

Current-Mode Circuit Design using Sub-threshold MOSFET (Sub-threshold MOSFET을 이용한 전류모드 회로 설계)

  • Cho, Seung-Il;Yeo, Sung-Dae;Lee, Kyung-Ryang;Kim, Seong-Kweon
    • Journal of Satellite, Information and Communications
    • /
    • v.8 no.3
    • /
    • pp.10-14
    • /
    • 2013
  • Current-mode circuits dissipate constant power regardless of operating frequency, which leads to large power dissipation at low operating frequencies when they are applied to low-power designs based on dynamic voltage and frequency scaling (DVFS). To solve this problem, we introduce a low-power current-mode circuit design technique that operates the MOSFET in the sub-threshold region. BSIM 3 was used as the MOSFET model in circuit simulation. In simulation, the current memory circuit with sub-threshold MOSFETs dissipated $18.98{\mu}W$, a reduction of about 98% compared with $900{\mu}W$ for the same circuit in strong inversion. This confirms that the proposed circuit design technique is applicable to DVFS using current-mode circuits.
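The 98% figure follows directly from the two power values quoted in the abstract, as this short check shows:

$$1 - \frac{18.98\,{\mu}W}{900\,{\mu}W} \approx 1 - 0.0211 = 0.9789 \approx 98\%$$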

Design of a High Performance Two-Step SOVA Decoder (고성능 Two-Step SOVA 복호기 설계)

  • 전덕수
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.3
    • /
    • pp.384-389
    • /
    • 2003
  • A new two-step soft-output Viterbi algorithm (SOVA) decoder architecture is presented. A significant reduction in the decoding latency can be achieved through the use of the dual-port RAM in the survivor memory structure of the trace-back unit. The system complexity can be lowered due to the determination of the absolute value of the path metric differences inside the add-compare-select (ACS) unit. The proposed SOVA architecture was verified successfully by the functional simulation of Verilog HDL modeling and the FPGA prototyping. The SOVA decoder achieves a data rate very close to that of the conventional Viterbi Algorithm (VA) decoder and the resource consumption of the realized SOVA decoder is only one and a half times larger than that of the conventional VA decoder.
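The add-compare-select step mentioned above can be sketched as follows. This is a software illustration only, assuming distance-style metrics where the smaller value wins; the function name `acs` and its argument names are hypothetical, and the sketch says nothing about the paper's hardware structure. It shows how the absolute path metric difference, which SOVA uses as a reliability value, falls out of the same comparison that selects the survivor.

```python
def acs(metric_a, branch_a, metric_b, branch_b):
    """One add-compare-select step for a trellis state with two incoming paths.

    Returns (survivor_metric, survivor_index, delta), where delta is the
    absolute difference between the two candidate path metrics.
    """
    # Add: extend each incoming path by its branch metric.
    cand_a = metric_a + branch_a
    cand_b = metric_b + branch_b

    # The difference is computed alongside the comparison, so no separate
    # unit is needed to derive the reliability value later.
    delta = abs(cand_a - cand_b)

    # Compare and select: keep the path with the smaller metric (assumed convention).
    if cand_a <= cand_b:
        return cand_a, 0, delta
    return cand_b, 1, delta


print(acs(3, 1, 2, 4))
```

A small delta means the discarded path was nearly as good as the survivor, so the corresponding decision is less reliable.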

Parallel BCH Encoding/decoding Method and VLSI Design for Nonvolatile Memory (비휘발성 메모리를 위한 병렬 BCH 인코딩/디코딩 방법 및 VLSI 설계)

  • Lee, Sang-Hyuk;Baek, Kwang-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.5
    • /
    • pp.41-47
    • /
    • 2010
  • This paper proposes a parallel BCH scheme, an error correction coding method used in NAND flash memory for solid-state disks (SSDs). By allowing the error correction capability to be altered, the proposed design improves the reliability of data blocks whose error rate rises as they are used more frequently. The parallel bit width for decoding is twice that for encoding, which reduces decoding time and results in a one-half reduction compared with conventional ECC.