• Title/Summary/Keyword: Memory reduction

An Efficient Multidimensional Scaling Method based on CUDA and Divide-and-Conquer (CUDA 및 분할-정복 기반의 효율적인 다차원 척도법)

  • Park, Sung-In;Hwang, Kyu-Baek
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.427-431 / 2010
  • Multidimensional scaling (MDS) is a widely used method for dimensionality reduction whose purpose is to represent high-dimensional data in a low-dimensional space while preserving the distances among objects as much as possible. MDS has mainly been applied to data visualization and feature selection. Among the various MDS methods, classical MDS is not readily applicable to data with a large number of objects on ordinary desktop computers because of its computational complexity. More precisely, it needs to solve eigenpair problems on dissimilarity matrices based on Euclidean distance. Thus, the running time and memory required by classical MDS increase sharply as n (the number of objects) grows, restricting its use in large-scale domains. In this paper, we propose an efficient approximation algorithm for classical MDS based on divide-and-conquer and CUDA. Through a set of experiments, we show that our approach is highly efficient and effective for the analysis and visualization of data consisting of several thousand objects.
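
To make the cost described above concrete, here is a minimal sketch of baseline classical MDS only, not the divide-and-conquer CUDA approximation the paper proposes. It double-centers the squared distance matrix (n × n memory) and extracts the leading eigenpairs with plain power iteration. The function name `classical_mds`, the deterministic start vector and the iteration count are assumptions made for illustration.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Embed n objects in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J (this matrix costs O(n^2) memory).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the largest remaining eigenpair of B.
        v = [math.sin(i + 1 + d) for i in range(n)]  # arbitrary start vector
        lam = 0.0
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            lam = math.sqrt(sum(x * x for x in w))
            if lam < 1e-12:
                lam = 0.0
                break
            v = [x / lam for x in w]
        for i in range(n):
            coords[i][d] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords
```

For points that really lie in a plane, the pairwise distances of the returned coordinates should match the input distances up to rounding, although the embedding itself may be rotated or reflected.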

A Study on Low-Current-Operation of 850nm Oxide VCSELs Using a Large-Signal Circuit Model (대신호 등가회로 모델을 이용한 850nm Oxide VCSEL의 저전류 동작 특성 연구)

  • Jang, Min-Woo;Kim, Sang-Bae
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.10 s.352 / pp.10-21 / 2006
  • We have studied the characteristics of oxide VCSELs when their off-current and on-current are kept small, in order to assess the feasibility of low-current operation. A large-signal equivalent circuit model has been used. By comparing measured data with simulation results, the parameters of the large-signal model, including the capacitances, are obtained. Using the large-signal model, we have investigated the effects of capacitance and on/off currents on the turn-on/turn-off characteristics and the eye diagram. According to the experiments and simulations, the depletion capacitance, which has previously been neglected, is found to have a significant influence on the turn-on delay and the eye diagram. Therefore, for high-speed, low-current operation, reducing the depletion capacitance is essential.

Low Power Architecture of FIR Filter for 2D Image Filter (2D Image Filter에 적합한 저전력 FIR Filter의 구현)

  • Han, Chang-Yeong;Park, Hyeong-Jun;Kim, Lee-Seop
    • Journal of the Institute of Electronics Engineers of Korea SD / v.38 no.9 / pp.663-670 / 2001
  • This paper proposes a new power reduction method for 2D FIR (Finite Impulse Response) filters. We exploit the spatial redundancy of image data to reduce the power dissipated by the multiplications in FIR filters. Since the higher bits of input pixels rarely change, redundant multiplication of the higher bits is avoided by separating each multiplication into a higher part and a lower part. The products computed for the higher bits are stored in memory cells that act as a cache, so that they can be reused when a cache hit occurs. With the proposed separated multiplication technique (SMT), power in 2D FIR filter modules can be reduced by about 15%.
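
The separated multiplication idea in the abstract above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the authors' hardware design: the 4-bit split point, the dictionary used as the cache, and the names `make_separated_multiplier` and `stats` are all hypothetical.

```python
def make_separated_multiplier(coeff, low_bits=4):
    """Return (multiply, stats) for one fixed filter coefficient.

    The pixel is split into a high part and a low part (split point assumed).
    The product for the high part is cached and reused on a hit, so only the
    short low-part multiplication has to be recomputed.
    """
    cache = {}  # high part of the pixel -> coeff * (high << low_bits)
    stats = {"hits": 0, "misses": 0}
    mask = (1 << low_bits) - 1

    def multiply(pixel):
        high, low = pixel >> low_bits, pixel & mask
        if high in cache:
            stats["hits"] += 1
        else:
            stats["misses"] += 1
            cache[high] = coeff * (high << low_bits)
        # coeff * pixel == coeff * (high << low_bits) + coeff * low
        return cache[high] + coeff * low

    return multiply, stats
```

Feeding a run of similar pixel values through `multiply` should produce mostly cache hits, which mirrors the spatial redundancy the abstract relies on.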

Adaptive Frequency Scaling for Efficient Power Management in Pipelined Deep Packet Inspection Systems (파이프라인형 DPI 시스템에서 효율적인 소비전력 감소를 위한 동작주파수 설계방법)

  • Kim, Han-Soo
    • Journal of the Korea Society of Computer and Information / v.19 no.12 / pp.133-141 / 2014
  • An efficient method for reducing power consumption in pipelined deep packet inspection (DPI) systems is proposed. It is based on the observation that memory accesses dominate the power consumption and that the number of accesses drops drastically as the input passes through the stages of the pipelined AC-DFA. A DPI system is implemented in which the operating frequency of the infrequently used pipeline stages is reduced to eliminate wasted power. The power consumption of the proposed DPI system is measured for various input character sets, and a reduction of up to 25% in total power consumption is obtained compared with recent DPI systems. The method can easily be applied to other pipelined architectures and string-searching applications.
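
The observation about accesses per stage can be explored with a small software model. The sketch below builds an ordinary Aho-Corasick automaton and counts how often the matcher sits in a state of each trie depth. It assumes, purely for illustration, that pipeline stage k serves the states at depth k; the abstract does not spell out this mapping, and the function names are hypothetical.

```python
from collections import Counter, deque

def build_automaton(patterns):
    """Build Aho-Corasick goto/fail tables plus the trie depth of each state."""
    goto, fail, depth = [{}], [0], [0]
    for pattern in patterns:
        s = 0
        for ch in pattern:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                depth.append(depth[s] + 1)
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
    queue = deque(goto[0].values())  # depth-1 states keep fail = root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            queue.append(t)
    return goto, fail, depth

def depth_histogram(patterns, text):
    """Count, per trie depth, how many input characters end in that depth."""
    goto, fail, depth = build_automaton(patterns)
    hist = Counter()
    s = 0
    for ch in text:
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hist[depth[s]] += 1
    return dict(hist)
```

On traffic that rarely matches any pattern, most of the counts would be expected to fall at shallow depths, which is the kind of imbalance that makes lowering the clock of deeper stages attractive.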

Data Reusable Search Scan Methods for Low Power Motion Estimation (저전력 움직임 추정을 위한 데이터 재사용 스캔 방법)

  • Kim, Tae Sun;SunWoo, Myung Hoon
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.9 / pp.85-91 / 2013
  • This paper proposes data-reusable search scan methods for full search and fast search to implement low-power Motion Estimation (ME). The proposed Optimized Sub-region Partitioning (OSP) method, which divides the search region into several sub-regions, can reduce the number of required Reconfigurable Register Arrays (RRAs) by half compared with the existing smart snake scan method at the same data reusability. In addition, the proposed Center Biased Search Scan (CBSS) method for various fast search algorithms can improve data reusability. Performance comparisons show that the proposed search scan methods reduce the average redundant data loading by about 26.9% and 16.1% compared with the existing raster scan and snake scan methods, respectively. Because they reduce memory accesses, the proposed search scan methods are well suited to low-power, high-performance ME implementations.
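
Why scan order affects data loading can be seen with a toy model of the two baseline scans named in the abstract. The sketch does not implement OSP or CBSS. It assumes a single n × n reference window is buffered and that only pixels not already covered by the previous window must be loaded; the function names are hypothetical.

```python
def raster_order(width, height):
    """Visit candidate positions row by row, always left to right."""
    return [(x, y) for y in range(height) for x in range(width)]

def snake_order(width, height):
    """Visit candidate positions row by row, reversing direction each row."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order

def pixels_loaded(order, n):
    """Total pixels fetched when an n x n window follows `order`."""
    total = n * n  # the first window is loaded in full
    for (x0, y0), (x1, y1) in zip(order, order[1:]):
        overlap = max(0, n - abs(x1 - x0)) * max(0, n - abs(y1 - y0))
        total += n * n - overlap  # only the non-overlapping part is new
    return total
```

With a 5 × 5 grid of candidate positions and 8 × 8 blocks, this model loads 368 pixels for the raster scan and 256 for the snake scan, because the snake scan never jumps back to the start of a row.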

A Hybrid Model of Network Intrusion Detection System : Applying Packet based Machine Learning Algorithm to Misuse IDS for Better Performance (Misuse IDS의 성능 향상을 위한 패킷 단위 기계학습 알고리즘의 결합 모형)

  • Weon, Ill-Young;Song, Doo-Heon;Lee, Chang-Hoon
    • The KIPS Transactions:PartC / v.11C no.3 / pp.301-308 / 2004
  • Misuse IDS is known to have acceptable accuracy but suffers from a high rate of false alarms. We present behavior-based alarm reduction using a memory-based machine learning technique. Our extended form of IBL (XIBL) examines each SNORT alarm signal to decide whether it is worth sending to the security manager. An experiment shows that there is an apparent difference between true alarms and false alarms with respect to XIBL behavior. This gives clear evidence that, although an attack in the network consists of a sequence of packets, decisions on individual packets can be used in conjunction with misuse IDS for better performance.

Optimization of Parallel Code for Noise Prediction in an Axial Fan Using MPI One-Sided Communication (MPI 일방향통신을 이용한 축류 팬 주위 소음해석 병렬프로그램 최적화)

  • Kwon, Oh-Kyoung;Park, Keuntae;Choi, Haecheon
    • KIPS Transactions on Computer and Communication Systems / v.7 no.3 / pp.67-72 / 2018
  • Recently, noise reduction in axial fans, a type of turbomachine that produces a small pressure rise and a large flow rate, has been recognized as essential. This study describes the design and optimization of an MPI parallel program that simulates flow-induced noise in an axial fan. To run the simulation with 100 million grid points for the flow and 70,000 points for the noise sources, we parallelize the code using 2D domain decomposition. However, when many computing cores are involved, the code slows down because of MPI communication overhead among nodes, especially in the noise simulation. One-sided communication is therefore adopted to reduce the MPI communication overhead. Moreover, the allocated memory and the communication between cores are optimized, yielding a 2.97x improvement over the original code. Finally, the flow and noise simulations run 12x and 6x faster on 6,144 and 128 computing cores of KISTI Tachyon2 than on 256 and 16 computing cores, respectively.

A new design method of m-bit parallel BCH encoder (m-비트 병렬 BCH 인코더의 새로운 설계 방법)

  • Lee, June;Woo, Choong-Chae
    • Journal of the Institute of Convergence Signal Processing / v.11 no.3 / pp.244-249 / 2010
  • Low-complexity error correction codes are attractive for next-generation multi-level cell flash memory. Sharing sub-expressions is an effective way to reduce complexity and chip size. This paper proposes a new low-complexity design method for an m-bit parallel BCH encoder based on a serial linear feedback shift register structure, using sub-expressions. In addition, a general algorithm for obtaining the sub-expressions is introduced. A sub-expression can be expressed as a matrix operation between a sub-matrix of the generator matrix and the sum of two different variables. The number of sub-expressions is restricted by. The obtained sub-expressions can be shared to implement different m-parallel BCH encoders. This paper focuses not on the delay problem induced by large fan-out but on complexity reduction, especially in the number of gates.
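
The relation between a serial LFSR encoder and an m-bit parallel one can be illustrated with the sketch below. It is not the paper's sub-expression algorithm: it only unrolls the serial LFSR m steps using linearity over GF(2), producing the XOR terms in which shared sub-expressions would later be sought. The function names are hypothetical, and the example generator in the usage note, x^8 + x^7 + x^6 + x^4 + 1 for BCH(15,7), is chosen here for illustration.

```python
def serial_step(state, bit, g_low, r):
    """One clock of the serial LFSR: state <- (state * x + bit * x^r) mod g."""
    feedback = bit ^ ((state >> (r - 1)) & 1)
    state = (state << 1) & ((1 << r) - 1)
    return state ^ g_low if feedback else state

def serial_parity(bits, g_low, r):
    """Parity of a message (MSB first), computed one bit per clock."""
    state = 0
    for bit in bits:
        state = serial_step(state, bit, g_low, r)
    return state

def build_parallel_tables(m, g_low, r):
    """Response of m serial clocks to each single state bit and input bit."""
    def run(state, bits):
        for bit in bits:
            state = serial_step(state, bit, g_low, r)
        return state
    state_cols = [run(1 << i, [0] * m) for i in range(r)]
    input_cols = [run(0, [int(j == i) for j in range(m)]) for i in range(m)]
    return state_cols, input_cols

def parallel_parity(bits, m, g_low, r):
    """Same parity, consuming m message bits per clock."""
    if len(bits) % m:
        raise ValueError("message length must be a multiple of m")
    state_cols, input_cols = build_parallel_tables(m, g_low, r)
    state = 0
    for k in range(0, len(bits), m):
        nxt = 0
        for i in range(r):          # XOR terms driven by the current state
            if (state >> i) & 1:
                nxt ^= state_cols[i]
        for i in range(m):          # XOR terms driven by the m input bits
            if bits[k + i]:
                nxt ^= input_cols[i]
        state = nxt
    return state
```

For the example generator, `g_low = 0xD1` and `r = 8`, and `parallel_parity(bits, m, 0xD1, 8)` should agree with `serial_parity(bits, 0xD1, 8)` for any m that divides the message length.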

Performance Improvement of Nearest-neighbor Classification Learning through Prototype Selections (프로토타입 선택을 이용한 최근접 분류 학습의 성능 개선)

  • Hwang, Doo-Sung
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.53-60 / 2012
  • Nearest-neighbor classification predicts the class of an input as the most frequent class among the training data near that input. Even though nearest-neighbor classification has no training stage, all of the training data are needed in the prediction stage, and the generalization performance depends on the quality of the training data. Therefore, as the training data size increases, nearest-neighbor classification requires a large amount of memory and a long computation time for prediction. In this paper, we propose a prototype selection algorithm that predicts the class of test data with a new set of prototypes consisting of near-boundary training data. Based on Tomek links and a distance metric, the proposed algorithm selects boundary data and decides whether each selected datum is added to the set of prototypes by considering class and distance relationships. In the experiments, the number of prototypes is much smaller than the size of the original training data, and we obtain the benefits of storage reduction and fast prediction in nearest-neighbor classification.
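
A Tomek link is a pair of points from different classes that are each other's nearest neighbor, so such pairs sit on the class boundary. The sketch below keeps only those points as prototypes and classifies with 1-NN over them. It is a simplified stand-in, not the paper's selection rule, which additionally weighs class and distance relationships; all function names are hypothetical.

```python
import math

def nearest(idx, points):
    """Index of the nearest other point to points[idx] (Euclidean)."""
    best, best_d = None, math.inf
    for j, q in enumerate(points):
        if j != idx:
            d = math.dist(points[idx], q)
            if d < best_d:
                best, best_d = j, d
    return best

def tomek_link_prototypes(points, labels):
    """Indices of points that take part in at least one Tomek link."""
    nn = [nearest(i, points) for i in range(len(points))]
    protos = set()
    for i, j in enumerate(nn):
        # Mutual nearest neighbors with different labels form a Tomek link.
        if nn[j] == i and labels[i] != labels[j]:
            protos.update((i, j))
    return sorted(protos)

def predict_1nn(x, points, labels, protos):
    """Classify x by its nearest prototype instead of the full training set."""
    best = min(protos, key=lambda i: math.dist(x, points[i]))
    return labels[best]
```

Prediction then touches only the prototype set, which is where the storage and time savings mentioned in the abstract would come from.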

A Low Power ECC H-matrix Optimization Method using an Ant Colony Optimization (ACO를 이용한 저전력 ECC H-매트릭스 최적화 방안)

  • Lee, Dae-Yeal;Yang, Myung-Hoon;Kim, Yong-Joon;Park, Young-Kyu;Yoon, Hyun-Jun;Kang, Sung-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.1 / pp.43-49 / 2008
  • In this paper, a method using Ant Colony Optimization (ACO) is proposed for reducing the power consumption of memory ECC checker circuitry that provides Single-Error Correcting and Double-Error Detecting (SEC-DED) capability. The H-matrix used to generate SEC-DED codes is optimized for minimum switching activity, with little to no impact on area or delay, by using the symmetric property and the degrees of freedom in constructing the H-matrix of Hsiao codes. Experiments demonstrate that the proposed method further reduces power consumption compared with previous works.
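
For readers unfamiliar with Hsiao codes, the following minimal sketch shows how an H-matrix with odd-weight columns gives SEC-DED behavior on a toy (8,4) code. It does not model switching activity or the ACO search from the paper. The particular column assignment is an assumption; other choices of distinct odd-weight columns would work equally well, which is the kind of freedom the abstract refers to.

```python
# Columns of H as 4-bit integers. Every column has odd weight (Hsiao property):
# data columns have weight 3 and check columns have weight 1.
DATA_COLS = [0b0111, 0b1011, 0b1101, 0b1110]
CHECK_COLS = [0b0001, 0b0010, 0b0100, 0b1000]
H_COLS = DATA_COLS + CHECK_COLS

def encode(data_bits):
    """Append 4 check bits to 4 data bits."""
    check = 0
    for bit, col in zip(data_bits, DATA_COLS):
        if bit:
            check ^= col
    return list(data_bits) + [(check >> i) & 1 for i in range(4)]

def syndrome(word):
    """XOR of the H columns selected by the set bits of the received word."""
    s = 0
    for bit, col in zip(word, H_COLS):
        if bit:
            s ^= col
    return s

def decode(word):
    """Return (status, data): correct one error, detect two."""
    s = syndrome(word)
    if s == 0:
        return "ok", list(word[:4])
    if bin(s).count("1") % 2 == 1 and s in H_COLS:
        # An odd-weight syndrome equals the column of the single flipped bit.
        fixed = list(word)
        fixed[H_COLS.index(s)] ^= 1
        return "corrected", fixed[:4]
    # Two odd-weight columns XOR to a nonzero even-weight syndrome.
    return "double_error", None
```

Each row of this H-matrix contains four ones, so every check bit is the XOR of the same number of inputs; which data bits feed which XOR tree is what an optimizer could rearrange.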