• Title/Abstract/Keywords: reducing memory

422 search results

Real-Time Rule-Based System Architecture for Context-Aware Computing (실시간 상황 인식을 위한 하드웨어 룰-베이스 시스템의 구조)

  • Lee, Seung-Wook;Kim, Jong-Tae;Sohn, Bong-Ki;Lee, Keon-Myung;Cho, Jun-Dong;Lee, Jee-Hyung;Jeon, Jae-Wook
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.5 / pp.587-592 / 2004
  • Context-aware computing systems require a real-time context reasoning process for context awareness. Context reasoning can be performed by comparing input information from sensors with the knowledge base within the system, which is essentially how rule-based systems operate. In this paper, we propose a hardware rule-based system architecture that can process context reasoning in real time. Compared to previous architectures, the proposed hardware rule-based system architecture reduces the number of constraints on rule representations and on combinations of condition terms in rules. A modified content addressable memory, a crossbar switch network, and a pre-processing module are used to reduce these constraints. Describing the system in SystemC allows the system configuration to be modified easily later.
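
In software terms, the reasoning step described above amounts to matching sensor facts against the condition terms of rules; the following is a minimal, purely illustrative Python sketch of such a rule-matching loop (the rule format and sensor names are invented for illustration and are unrelated to the paper's CAM/crossbar hardware design).

```python
# Purely illustrative rule-based context reasoning: match current sensor
# readings against the condition terms of each rule and collect the
# conclusions of every rule that fires. Rule contents are hypothetical.

rules = [
    ({"location": "meeting_room", "noise": "low"}, "context: in_meeting"),
    ({"location": "home", "light": "off"}, "context: sleeping"),
]

def infer(sensor_facts: dict) -> list:
    """Return the conclusions of all rules whose condition terms all match."""
    conclusions = []
    for conditions, conclusion in rules:
        if all(sensor_facts.get(key) == value for key, value in conditions.items()):
            conclusions.append(conclusion)
    return conclusions

print(infer({"location": "meeting_room", "noise": "low", "light": "on"}))
# ['context: in_meeting']
```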

Performance Comparison of Synchronization Methods for CC-NUMA Systems (CC-NUMA 시스템에서의 동기화 기법에 대한 성능 비교)

  • Moon, Eui-Sun;Jhang, Seong-Tae;Jhon, Chu-Shik
    • Journal of KIISE:Computer Systems and Theory / v.27 no.4 / pp.394-400 / 2000
  • The main goal of synchronization is to guarantee exclusive access to shared data and critical sections, thereby making parallel programs work correctly and reliably. Exclusive access restricts the parallelism of parallel programs, so efficient synchronization is essential for achieving high performance in shared-memory parallel programs. Many techniques have been devised for efficient synchronization, exploiting features of systems and applications. This paper presents simulation results showing that existing synchronization methods are inefficient on a CC-NUMA (Cache Coherent Non-Uniform Memory Access) system, and compares them with Freeze&Melt synchronization, which can remove this inefficiency. The simulation results show that Test-and-Test&Set synchronization suffers from the cost of broadcast operations, and that the pre-defined order in which Queue-On-Lock-Bit (QOLB) synchronization executes a critical section also causes inefficiency. Freeze&Melt synchronization, which removes these inefficiencies, gains performance by decreasing the waiting time for and the execution time of a critical section, and by reducing the traffic between clusters.
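
Test-and-Test&Set, mentioned above, is a standard spinlock refinement: threads spin on an ordinary read and attempt the atomic exchange only when the lock appears free. A minimal pedagogical sketch follows; Python's GIL hides the cache-coherence effects the paper measures, and the atomic exchange is emulated with a helper lock.

```python
# Illustrative Test-and-Test&Set (TTAS) spinlock: spin on a plain read of the
# flag and attempt the (emulated) atomic exchange only when the lock appears
# free, which on real hardware cuts coherence/broadcast traffic versus a plain
# Test&Set that retries the atomic operation blindly.
import threading

class TTASLock:
    def __init__(self):
        self._locked = False
        self._atomic = threading.Lock()   # stand-in for an atomic exchange

    def _test_and_set(self) -> bool:
        """Atomically set the flag to True and return its previous value."""
        with self._atomic:
            previous, self._locked = self._locked, True
            return previous

    def acquire(self):
        while True:
            while self._locked:           # "test": cheap local spinning
                pass
            if not self._test_and_set():  # "test&set": the expensive atomic step
                return

    def release(self):
        self._locked = False

# Usage: two threads incrementing a shared counter inside the critical section.
lock, counter = TTASLock(), 0

def worker():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 2000
```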


A Memory-based Reasoning Algorithm using Adaptive Recursive Partition Averaging Method (적응형 재귀 분할 평균법을 이용한 메모리기반 추론 알고리즘)

  • 이형일;최학윤
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.478-487 / 2004
  • We previously proposed the RPA (Recursive Partition Averaging) method to improve the storage requirement and classification rate of Memory Based Reasoning. That algorithm performed reasonably well in many domains; however, the major drawbacks of RPA are its partitioning condition and its way of extracting major patterns. We propose an adaptive RPA (ARPA) algorithm that uses an FPD (feature-based population densimeter) to stop the partitioning process and to produce optimized hyperrectangles instead of RPA's averaged major patterns. The proposed algorithm requires only approximately 40% of the memory space needed by a k-NN classifier and shows classification performance superior to RPA. Also, despite reducing the number of stored patterns, it shows excellent classification results compared to k-NN.
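
Classification against stored hyperrectangles, rather than against every raw k-NN instance, can be pictured with a short sketch. The rectangles and the nearest-boundary rule below are illustrative assumptions, not the ARPA partitioning procedure itself.

```python
# Illustrative memory-based classification with hyperrectangles: a query point
# takes the label of the rectangle containing it, falling back to the
# rectangle whose boundary is closest. The rectangles are hypothetical; ARPA
# would derive them from the training data by recursive partition averaging.

# Each entry: (lower corner, upper corner, class label)
hyperrectangles = [
    ((0.0, 0.0), (0.5, 0.5), "A"),
    ((0.5, 0.0), (1.0, 1.0), "B"),
]

def distance_to_rect(point, lower, upper):
    """Euclidean distance from a point to an axis-aligned rectangle (0 if inside)."""
    return sum(max(lo - p, 0.0, p - hi) ** 2
               for p, lo, hi in zip(point, lower, upper)) ** 0.5

def classify(point):
    return min(hyperrectangles, key=lambda r: distance_to_rect(point, r[0], r[1]))[2]

print(classify((0.2, 0.3)))  # 'A' (inside the first rectangle)
print(classify((0.9, 0.2)))  # 'B'
```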

Design of an Area-Efficient Architecture for Block-wise MAP Turbo Decoder (면적 효율적인 구조의 블록 MAP 터보 복호기 설계)

  • Kang, Moon-Jun;Kim, Sik;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.8A / pp.725-732 / 2002
  • The block-wise MAP (Maximum A Posteriori) decoding algorithm for turbo codes requires less memory than the Log-MAP decoding algorithm. The BER (Bit Error Rate) performance of the previous block-wise MAP decoding algorithm depends on the block length and the training length. To maximize hardware utilization and perform successive decoding, previous MAP decoding algorithms set the block length equal to the training length. Simulation results on BER performance show that the BER performance can be maintained with shorter blocks when the training length is sufficient. This paper proposes an area-efficient block-wise MAP decoder architecture. The proposed architecture employs a decoding scheme that reduces memory by using a training length that is N times larger than the block length. To handle the proposed scheme efficiently, a pipelined architecture is proposed. Simulation results show that memory usage can be reduced by 30%~45% in the proposed architecture without degrading the BER performance.

Adaptive-length pendulum smart tuned mass damper using shape-memory-alloy wire for tuning period in real time

  • Pasala, Dharma Theja Reddy;Nagarajaiah, Satish
    • Smart Structures and Systems / v.13 no.2 / pp.203-217 / 2014
  • Due to the paradigm shift from passive control to adaptive control, smart tuned mass dampers (STMDs) have received considerable attention for vibration control in tall buildings and bridges. STMDs are superior to tuned mass dampers (TMDs) in reducing the response of the primary structure. Unlike TMDs, STMDs can accommodate changes in the primary structure's properties, due to damage or deterioration, by tuning in real time based on local feedback. In this paper, a novel adaptive-length pendulum (ALP) damper is developed and experimentally verified. The length of the pendulum is adjusted in real time using a shape memory alloy (SMA) wire actuator. This can be achieved in two ways: i) by changing the amount of current in the SMA wire actuator, or ii) by changing the effective length of the current-carrying SMA wire. Using an instantaneous frequency tracking algorithm, the dominant frequency of the structure can be tracked from a local feedback signal, and the length of the pendulum is then adjusted to match the dominant frequency. The effectiveness of the proposed ALP-STMD mechanism, combined with the STFT frequency tracking control algorithm, is verified experimentally on a prototype two-storey shear frame. It was observed through experimental studies that the ALP-STMD absorbs most of the input energy in the vicinity of the tuned frequency of the pendulum damper. Reductions in storey displacements of up to 80% under forced excitation (harmonic and chirp signals), and a faster decay rate during free vibration, are observed in the experiments.
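
The tuning step, adjusting the pendulum length to track the dominant structural frequency, follows the simple-pendulum relation $f = \frac{1}{2\pi}\sqrt{g/L}$. A small sketch under that textbook relation (the target frequencies are arbitrary example values; the actual ALP realizes the length change through the SMA wire actuator):

```python
# Tuning a pendulum STMD: a simple pendulum has natural frequency
# f = (1 / 2*pi) * sqrt(g / L), so matching a tracked dominant frequency
# f_target requires a length L = g / (2*pi*f_target)**2.
import math

G = 9.81  # gravitational acceleration, m/s^2

def pendulum_length_for_frequency(f_target_hz: float) -> float:
    """Pendulum length (m) whose natural frequency matches f_target_hz."""
    return G / (2 * math.pi * f_target_hz) ** 2

for f in (0.5, 1.0, 2.0):  # example dominant frequencies in Hz
    print(f"{f:.1f} Hz -> L = {pendulum_length_for_frequency(f):.3f} m")
# 0.5 Hz -> L = 0.994 m, 1.0 Hz -> L = 0.248 m, 2.0 Hz -> L = 0.062 m
```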

Efficient Hyperplane Generation Techniques for Human Activity Classification in Multiple-Event Sensors Based Smart Home (다중 이벤트 센서 기반 스마트 홈에서 사람 행동 분류를 위한 효율적 의사결정평면 생성기법)

  • Chang, Juneseo;Kim, Boguk;Mun, Changil;Lee, Dohyun;Kwak, Junho;Park, Daejin;Jeong, Yoosoo
    • IEMEK Journal of Embedded Systems and Applications / v.14 no.5 / pp.277-286 / 2019
  • In this paper, we propose an efficient hyperplane generation technique to classify human activity from the combination of events and sequence information obtained from multiple-event sensors. By generating hyperplanes efficiently, our machine learning algorithm classifies with less memory and runtime than an LSVM (Linear Support Vector Machine) on an embedded system. Because lightweight, high-speed algorithms are among the most critical issues in the IoT, this study can be applied to smart homes to predict human activity and provide related services. Our approach is based on reducing the number of hyperplanes and utilizing a robust string-comparison algorithm. The proposed method reduces memory consumption compared to conventional ML (Machine Learning) algorithms: 252 times versus LSVM and 34,033 times versus LSTM (Long Short-Term Memory), although accuracy decreases slightly. Thus our method shows outstanding performance in accuracy per hyperplane: 240 times that of LSVM and 30,520 times that of LSTM. The binarized image is divided into groups, and each group is converted to a binary number in order to reduce the number of comparisons performed at runtime. The binary numbers are then converted to strings. Test data are evaluated by converting them to strings and measuring their similarity to the hyperplanes using the Levenshtein algorithm, a robust dynamic-programming string-comparison algorithm. This technique reduces runtime and makes the proposed algorithm 27% faster than LSVM and 90% faster than LSTM.
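
The runtime comparison step measures the Levenshtein distance between strings. A compact sketch of that distance computation with made-up example strings (the binarization and grouping of the event-sensor data are as described in the abstract and are not reproduced here):

```python
# Illustrative Levenshtein distance (dynamic programming) for comparing a
# binarized event string with a stored "hyperplane" string; the two example
# strings are made up. Python's standard library has no Levenshtein function,
# so a small implementation is shown.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits transforming a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

stored_hyperplane = "101100111010"
test_sequence     = "101000111110"
print(levenshtein(stored_hyperplane, test_sequence))  # 2 (smaller = more similar)
```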

An Efficient Resource Optimization Method for Provisioning on Flash Memory-Based Storage (플래시 메모리 기반 저장장치에서 프로비저닝을 위한 효율적인 자원 최적화 기법)

  • Hyun-Seob Lee
    • Journal of Internet of Things and Convergence / v.9 no.4 / pp.9-14 / 2023
  • Recently, resource optimization research has been actively conducted in enterprises and data centers to manage the rapid growth of big data. In particular, thin provisioning, which allocates more resources than the physically installed storage, has the effect of reducing initial costs; but as the amount of resources actually used grows, its cost effectiveness decreases and the management cost of allocating resources increases. In this paper, we propose a technique that divides the physical blocks of flash memory into single-bit cells and multi-bit cells, formats them with a hybrid technique, and manages them by separating frequently used hot data from infrequently used cold data. The proposed technique has the advantage that, as with thick provisioning, the physical and allocated resources are the same and no additional cost is incurred, while, as with thin provisioning, underutilized resources can be managed in multi-bit cell blocks, allowing more resources to be allocated than in typical storage devices. Finally, we estimated the resource optimization effectiveness of the proposed technique through simulation-based experiments.
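
A hot/cold separation over single-bit and multi-bit cell blocks can be pictured as a simple write-routing policy; the write-count threshold and data structures below are assumptions for illustration, not the paper's management scheme.

```python
# Illustrative hot/cold data placement on a hybrid flash layout: frequently
# updated (hot) logical pages go to single-bit-cell (SLC-like) blocks, rarely
# updated (cold) pages go to multi-bit-cell (MLC-like) blocks.
from collections import defaultdict

HOT_THRESHOLD = 4                 # writes within the observation window (assumed)
write_counts = defaultdict(int)
slc_blocks, mlc_blocks = [], []   # stand-ins for single-/multi-bit cell block pools

def place_write(logical_page: int, data: bytes) -> str:
    """Record a write and route it to the SLC or MLC pool by update frequency."""
    write_counts[logical_page] += 1
    if write_counts[logical_page] >= HOT_THRESHOLD:
        slc_blocks.append((logical_page, data))   # hot: fast, endurance-friendly cells
        return "SLC"
    mlc_blocks.append((logical_page, data))       # cold: dense, cheaper cells
    return "MLC"

for page in [7, 7, 7, 7, 42]:     # page 7 becomes hot, page 42 stays cold
    print(page, "->", place_write(page, b"payload"))
```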

A Fast Processor Architecture and 2-D Data Scheduling Method to Implement the Lifting Scheme 2-D Discrete Wavelet Transform (리프팅 스킴의 2차원 이산 웨이브릿 변환 하드웨어 구현을 위한 고속 프로세서 구조 및 2차원 데이터 스케줄링 방법)

  • Kim Jong Woog;Chong Jong Wha
    • Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.4 s.334 / pp.19-28 / 2005
  • In this paper, we propose a fast parallel 2-D discrete wavelet transform hardware architecture based on the lifting scheme. The proposed architecture improves the 2-D processing speed and reduces the internal memory buffer size. Previous lifting-scheme-based parallel 2-D wavelet transform architectures consisted of row-direction and column-direction modules, each a pair of prediction and update filter modules. In a 2-D wavelet transform, column-direction processing uses the row-direction results, which are generated in row-direction rather than column-direction order, so most hardware architectures need an internal buffer memory. The proposed architecture focuses on reducing the internal memory buffer size and the total calculation time. To reduce the total calculation time, we propose a 4-way data-flow scheduling and a memory-based parallel hardware architecture. The 4-way data-flow scheduling increases row-direction parallelism and reduces the initial latency before the row-direction calculation starts. In this hardware architecture, the internal buffer memory is not used to store the results of the row-direction calculation; instead, it holds intermediate values of the column-direction calculation. This is very effective for column-direction processing, because the column-direction input data are not generated in column-direction order. The proposed architecture was implemented in VHDL on an Altera Stratix device. The implementation results show the overall calculation time reduced from $N^2/2+\alpha$ to $N^2/4+\beta$, and the internal buffer memory size reduced to around $50\%$ of previous works.
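
The prediction/update pair at the heart of a lifting-scheme DWT can be illustrated in one dimension. The Haar lifting steps below are a generic textbook example rather than the paper's specific filter or 4-way scheduling; a 2-D transform applies such a pass along the rows and then along the columns of the row results.

```python
# One level of a 1-D Haar lifting transform: split into even/odd samples,
# predict the odd samples from the even ones, then update the even samples
# so they carry the running average.

def haar_lifting_1d(x: list) -> tuple:
    """Return (approximation, detail) coefficients for an even-length signal."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for o, e in zip(odd, even)]          # predict step
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update step
    return approx, detail

approx, detail = haar_lifting_1d([4.0, 6.0, 10.0, 12.0, 14.0, 14.0, 2.0, 0.0])
print(approx)  # [5.0, 11.0, 14.0, 1.0]  -- pairwise averages
print(detail)  # [2.0, 2.0, 0.0, -2.0]   -- pairwise differences
```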

CC-GiST: A Generalized Framework for Efficiently Implementing Arbitrary Cache-Conscious Search Trees (CC-GiST: 임의의 캐시 인식 검색 트리를 효율적으로 구현하기 위한 일반화된 프레임워크)

  • Loh, Woong-Kee;Kim, Won-Sik;Han, Wook-Shin
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.21-34 / 2007
  • Owing to the recent rapid price drop and capacity growth of main memory, the number of applications based on main memory databases is dramatically increasing. A cache miss, the phenomenon in which data required by the CPU is not resident in the cache and must be fetched from main memory, is one of the major causes of performance degradation in main memory databases. Several cache-conscious trees have been proposed for reducing cache misses and making the most of the cache in main memory databases. Since each cache-conscious tree has its own unique features, more than one cache-conscious tree may be used in a single application depending on the application's requirements. Moreover, if no existing cache-conscious tree satisfies the application's requirements, a new cache-conscious tree has to be implemented solely for that application. In this paper, we propose the cache-conscious generalized search tree (CC-GiST). The CC-GiST is a cache-conscious extension of the disk-based generalized search tree (GiST) [HNP95], and provides all the features and algorithms common to existing cache-conscious trees, including pointer compression and key compression techniques. To implement a cache-conscious tree based on the CC-GiST proposed in this paper, one only has to implement the few functions specific to that tree. We show how to implement the most representative cache-conscious trees, such as the CSB+-tree, the pkB-tree, and the CR-tree, based on the CC-GiST. The CC-GiST eliminates the trouble of managing more than one cache-conscious tree implementation in an application, and provides a framework for efficiently implementing arbitrary cache-conscious trees with new features.
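
The GiST idea of factoring tree-specific behavior into a handful of overridable methods can be sketched as an abstract interface. The method names below mirror the classic GiST operations (consistent, union, penalty, pickSplit) from [HNP95]; they are illustrative and are not the CC-GiST API itself.

```python
# Illustrative GiST-style extension interface: the framework owns the generic
# search/insert/split machinery and calls back into a small set of methods
# that each concrete (cache-conscious) tree overrides.
from abc import ABC, abstractmethod
from typing import Any, Sequence

class TreeExtension(ABC):
    @abstractmethod
    def consistent(self, entry_key: Any, query: Any) -> bool:
        """May the subtree under entry_key contain answers to query?"""

    @abstractmethod
    def union(self, keys: Sequence) -> Any:
        """Smallest key (e.g. range or MBR) covering all given keys."""

    @abstractmethod
    def penalty(self, existing_key: Any, new_key: Any) -> float:
        """Cost of inserting new_key under existing_key (guides descent)."""

    @abstractmethod
    def pick_split(self, keys: Sequence) -> tuple:
        """Partition overflowing entries (by index) into two nodes."""

class BPlusLikeExtension(TreeExtension):
    """A one-dimensional range tree, roughly in the spirit of a CSB+-tree."""
    def consistent(self, entry_key, query):
        lo, hi = entry_key
        return lo <= query <= hi
    def union(self, keys):
        return (min(k[0] for k in keys), max(k[1] for k in keys))
    def penalty(self, existing_key, new_key):
        lo, hi = self.union([existing_key, new_key])
        return (hi - lo) - (existing_key[1] - existing_key[0])
    def pick_split(self, keys):
        order = sorted(range(len(keys)), key=lambda i: keys[i][0])
        half = len(order) // 2
        return order[:half], order[half:]

ext = BPlusLikeExtension()
print(ext.consistent((10, 20), 15))    # True: the range [10, 20] may hold 15
print(ext.union([(10, 20), (5, 12)]))  # (5, 20)
```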

An Efficient MBR Compression Technique for Main Memory Multi-dimensional Indexes (메인 메모리 다차원 인덱스를 위한 효율적인 MBR 압축 기법)

  • Kim, Joung-Joon;Kang, Hong-Koo;Kim, Dong-Oh;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.9 no.2 / pp.13-23 / 2007
  • Recently, there has been growing interest in LBS (Location Based Services) requiring real-time services, and in spatial main memory DBMSs for efficient telematics services. To optimize existing disk-based multi-dimensional indexes of a spatial DBMS for main memory, multi-dimensional index structures have been proposed that minimize cache misses by reducing the entry size. However, because reducing the entry size requires compression relative to the parent node's MBR or the removal of redundant MBRs, the cost of MBR reconstruction increases during index updates and search efficiency is lowered. Thus, to reduce the cost of MBR reconstruction, this paper proposes the RSMBR (Relative-Sized MBR) compression technique, which chooses the base point of compression differently for broad and narrow distributions. For a broad distribution, compression is based on the bottom-left point of the extended MBR of the parent node; for a narrow distribution, the whole MBR is divided into equal-sized cells and compression is based on the bottom-left point of each cell. In addition, each MBR is compressed into a relative coordinate and size to reduce the search cost. Lastly, we evaluated the performance of the proposed RSMBR compression technique using real data and demonstrated its superiority.
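
Representing a child MBR as a small offset and size relative to a base point, such as the bottom-left corner of the parent's extended MBR, is the essence of this kind of compression. A minimal sketch, in which the quantization step and the conservative rounding are illustrative assumptions rather than the exact RSMBR encoding:

```python
# Illustrative MBR compression: store each child MBR as its offset from a base
# point (e.g. the bottom-left corner of the parent's extended MBR) plus its
# width and height, quantized to small integers.
import math

SCALE = 100.0   # coordinate units per quantization step (hypothetical)

def compress_mbr(mbr, base):
    """(xmin, ymin, xmax, ymax) -> four small ints relative to base = (bx, by)."""
    xmin, ymin, xmax, ymax = mbr
    bx, by = base
    dx = int((xmin - bx) // SCALE)            # origin offsets rounded down
    dy = int((ymin - by) // SCALE)
    w = math.ceil((xmax - xmin) / SCALE)      # extents rounded up
    h = math.ceil((ymax - ymin) / SCALE)
    return dx, dy, w, h

def decompress_mbr(compact, base):
    """Reconstruct a conservative (slightly enlarged) MBR from the compact form."""
    dx, dy, w, h = compact
    bx, by = base
    xmin, ymin = bx + dx * SCALE, by + dy * SCALE
    # +1 cell on each extent compensates for rounding the origin down,
    # so the reconstructed MBR always contains the original one.
    return xmin, ymin, xmin + (w + 1) * SCALE, ymin + (h + 1) * SCALE

parent_base = (10_000.0, 20_000.0)
child = (12_345.0, 20_678.0, 12_945.0, 21_278.0)
compact = compress_mbr(child, parent_base)
print(compact)                               # (23, 6, 6, 6): fits in a few bytes
print(decompress_mbr(compact, parent_base))  # (12300.0, 20600.0, 13000.0, 21300.0)
```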
