• Title/Summary/Keyword: memory efficiency

Effect of Microkernel Structure on Cache Memory Performance (마이크로커널 구조가 캐시 메모리의 성능에 미치는 영향)

  • Chang, Moon-Seok;Koh, Kern
    • Journal of KIISE:Computer Systems and Theory / v.27 no.1 / pp.68-80 / 2000
  • The trend toward modularization in modern software has dramatically changed cache access behavior. Many modern operating systems are likewise moving away from the traditional monolithic structure toward a highly modularized structure known as a microkernel. Microkernel-based operating systems are more portable and extensible, but they tend to perform worse. This paper quantitatively analyzes the effect of the microkernel structure on cache memory to identify the primary cause of this performance degradation. Through experiments on an Intel Pentium Pro platform, we found that the microkernel structure suffers remarkably more L1 cache, L2 cache, and TLB misses than the monolithic one. We also found that microkernel performance depends more on cache memory efficiency than on IPC. Finally, we found that these results stem from frequent context switches, which are caused mainly by the structural characteristics of a microkernel.

CMAC Learning Controller Implementation With Multiple Sampling Rate: An Inverted Pendulum Example (다중 샘플링 타임을 갖는 CMAC 학습 제어기 실현: 역진자 제어)

  • Lee, Byoung-Soo
    • Journal of Institute of Control, Robotics and Systems / v.13 no.4 / pp.279-285 / 2007
  • The objective of this research is twofold. The first is to design and propose a stable and robust learning control algorithm: the CMAC learning controller, which consists of a CMAC and a model-based controller, such as LQR or PID, that serves as the reference control. The second is to implement the reference control and the CMAC at two different sampling rates. A conventional controller is generally designed from a mathematical plant model. However, the increasing complexity of plants and the accuracy required of mathematical models nearly prohibit the conventional design approach. To avoid the inherent complexity and unavoidable uncertainty of modeling, biologically inspired methods have been developed. One such attempt is the Cerebellar Model Articulation Controller (CMAC) developed by Albus. CMAC has two main disadvantages. The first is that its memory requirement grows with the number of input variables and with the required accuracy; this can be addressed with the inexpensive memory made available by recent advances in memory technology. The second is its demand for processing power, which can be an obstacle, especially when CMAC must run in real time. To overcome these disadvantages, we propose a CMAC learning controller with multiple sampling rates. With this approach, the conventional controller that serves as the reference for the CMAC runs at a sufficiently high sampling rate, while the CMAC runs during the processor's idle time. To show the efficiency of the proposed method, an inverted pendulum controller is designed and implemented. We also demonstrate its potential as an industrial control solution and its robustness against modeling uncertainty.
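
As a rough illustration of how a CMAC stores what it learns, the following Python sketch implements a one-dimensional CMAC with overlapping tilings. It is not the controller from the paper: the class name `CMAC`, the tiling layout, and every parameter value are assumptions chosen only for illustration.

```python
class CMAC:
    """Minimal 1-D CMAC sketch (hypothetical layout, not the paper's design)."""

    def __init__(self, n_tilings=8, n_tiles=16, lo=0.0, hi=1.0, lr=0.2):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lo, self.hi = lo, hi
        self.lr = lr
        self.width = (hi - lo) / n_tiles
        # Each tiling is shifted by a fraction of a tile, so it needs one extra tile.
        self.w = [[0.0] * (n_tiles + 1) for _ in range(n_tilings)]

    def _active(self, x):
        """Yield (tiling, tile) pairs for the one active cell in each tiling."""
        x = min(max(x, self.lo), self.hi)
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            idx = int((x - self.lo + offset) / self.width)
            yield t, min(idx, self.n_tiles)

    def predict(self, x):
        """Output is the sum of the weights of the active cells."""
        return sum(self.w[t][i] for t, i in self._active(x))

    def train(self, x, target):
        """Spread a fraction lr of the output error evenly over the active cells."""
        err = target - self.predict(x)
        step = self.lr * err / self.n_tilings
        for t, i in self._active(x):
            self.w[t][i] += step
        return err


cmac = CMAC()
for _ in range(50):
    cmac.train(0.3, 1.0)
print(cmac.predict(0.3), cmac.predict(0.31), cmac.predict(0.9))
```

In this dense layout, `d` input variables would need on the order of `n_tilings * (n_tiles + 1) ** d` weights, which is the kind of memory growth the abstract refers to.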

Performance Analysis on Various Design Issues of Turbo Decoder (다양한 Design Issue에 대한 터보 디코더의 성능분석)

  • Park Taegeun;Kim Kiwhan
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12A / pp.1387-1395 / 2004
  • A turbo decoder offers excellent decoding efficiency, but it inherently requires large memory and high hardware complexity because of iterative decoding. To decrease the memory space and reduce the hardware complexity, various design issues have to be considered. In this paper, various design issues of the turbo decoder are investigated and the tradeoffs between hardware complexity and performance are analyzed. Through simulations for the fixed-length analysis, we chose 5 bits for the received data, 6 bits for the a priori information, and 7 bits for the quantized state metrics, so that the performance comes close to that of infinite precision. The MAX operation, the main function of the Log-MAP decoding algorithm, is analyzed, and the error correction term for the MAX* operation can be implemented efficiently with very small hardware overhead. The sliding window size was set to 32 to reduce the state metric memory space and to achieve an acceptable BER.
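
For readers unfamiliar with the MAX* operation, the sketch below shows the standard Jacobian-logarithm identity, max*(a, b) = max(a, b) + ln(1 + e^(-|a-b|)), together with a small lookup-table approximation of the correction term. The table size and step (`LUT`, `STEP`) are assumptions for illustration and are not taken from the paper.

```python
import math


def max_star(a: float, b: float) -> float:
    """Exact MAX*: ln(e^a + e^b), written as max plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


# Hypothetical 8-entry table of the correction term, sampled at bin midpoints.
STEP = 0.5
LUT = [math.log1p(math.exp(-(i + 0.5) * STEP)) for i in range(8)]


def max_star_lut(a: float, b: float) -> float:
    """Approximate MAX*: the correction comes from the table, or 0 beyond it."""
    idx = int(abs(a - b) / STEP)
    correction = LUT[idx] if idx < len(LUT) else 0.0
    return max(a, b) + correction


print(max_star(1.0, 2.0), max_star_lut(1.0, 2.0))
```

Dropping the correction term entirely gives the Max-Log-MAP approximation; a small table like this one is a common way to recover most of the lost accuracy at low cost.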

Research on An Energy Efficient Triangular Shape Routing Protocol based on Clusters (클러스터에 기반한 에너지 효율적 삼각모양 라우팅 프로토콜에 관한 연구)

  • Nurhayati, Nurhayati;Lee, Kyung-Oh
    • Journal of the Korea Society of Computer and Information / v.16 no.9 / pp.115-122 / 2011
  • In this paper, we propose an efficient dynamic workload balancing strategy that improves the performance of high-performance computing systems. The key idea is to minimize the execution time of each job and to maximize system throughput by making effective use of system resources such as CPU and memory. The strategy allocates jobs dynamically by considering the memory demand of each job and the workload status of each node. If an allocated job overloads a node, the proposed scheme migrates jobs running on the overloaded node to free nodes, balancing the workload across nodes and thereby reducing job waiting and execution times. Through simulation, we show that the proposed CPU- and memory-based dynamic workload balancing strategy improves the performance of high-performance computing systems compared with previous strategies.

Research for Efficient Massive File I/O on Parallel Programs (병렬 프로그램에서의 효율적인 대용량 파일 입출력 방식의 비교 연구)

  • Hwang, Gyuhyeon;Kim, Youngtae
    • Journal of Internet Computing and Services / v.18 no.2 / pp.53-60 / 2017
  • Because processors handle inputs and outputs independently on distributed-memory computers, different file input/output methods are used. In this paper, we implemented and compared several file I/O methods to evaluate their efficiency on distributed-memory parallel computers. The implemented I/O methods are as follows: (i) parallel I/O using NFS, (ii) sequential I/O on the host processor with domain decomposition, and (iii) MPI-IO. For the performance analysis, we used a separate file server and multiple processors on one or two computational servers. The results show that file I/O with NFS is the most efficient for inputs, and that sequential output with domain decomposition is the most efficient for outputs. Unexpectedly, MPI-IO showed the lowest performance.

VLSI architecture design of CAVLC entropy encoder/decoder for H.264/AVC (H.264/AVC를 위한 CAVLC 엔트로피 부/복호화기의 VLSI 설계)

  • Lee Dae-joon;Jeong Yong-jin
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.5C / pp.371-381 / 2005
  • In this paper, we propose an advanced hardware architecture for a CAVLC entropy encoder/decoder engine for real-time video compression. CAVLC (Context-based Adaptive Variable Length Coding) is a lossless compression method in H.264/AVC; it offers high compression efficiency but is computationally complex. The reference memory size is optimized using a partitioned storing method and a memory reuse method, both of which are based on the partiality of memory referencing. Among several encoder/decoder architectures, we choose the one most suitable for mobile devices and improve its performance using parallel processing. The proposed architecture has been verified on an ARM-interfaced emulation board using Altera Excalibur and synthesized in Samsung 0.18 um CMOS technology. The synthesis results show that the encoder can process about 300 CIF frames/s at 150 MHz and the decoder about 250 CIF frames/s at 140 MHz. The hardware architectures are being used as core modules in the implementation of a complete H.264/AVC video encoder/decoder chip for real-time multimedia applications.

Clustering Scheme using Memory Restriction for Wireless Sensor Network (무선센서네트워크에서 메모리 속성을 이용한 클러스터링 기법)

  • Choi, Hae-Won;Yoo, Kee-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.1B / pp.10-15 / 2009
  • Wireless sensor networks are increasingly regarded as one of the important technologies for the future IT industry, and their application areas are growing accordingly. Among the related protocols, those based on a hierarchical network topology are considered good in terms of energy efficiency. LEACH is the best-known routing protocol for the hierarchical topology. However, LEACH-related protocols have a problem with the range of message broadcasting, which must be expanded to cover the entire network. This paper therefore proposes a new clustering scheme to solve this problem shared by those protocols. The basic idea of our scheme is to use the inherent memory restrictions of sensor nodes. The results show that the proposed scheme supports load balancing by distributing clusters with a reasonable number of member nodes, thereby extending the network lifetime to about 1.8 times longer than with LEACH.

Potential of Bidirectional Long Short-Term Memory Networks for Crop Classification with Multitemporal Remote Sensing Images

  • Kwak, Geun-Ho;Park, Chan-Won;Ahn, Ho-Yong;Na, Sang-Il;Lee, Kyung-Do;Park, No-Wook
    • Korean Journal of Remote Sensing / v.36 no.4 / pp.515-525 / 2020
  • This study investigates the potential of bidirectional long short-term memory (Bi-LSTM) for efficient modeling of temporal information in crop classification using multitemporal remote sensing images. Unlike unidirectional LSTM models, which consider only forward or only backward states, Bi-LSTM can account for the temporal dependency of time-series images in both the forward and backward directions. This property can be applied effectively to crop classification when it is difficult to obtain full time-series images covering the entire growth cycle of crops. The classification performance of the Bi-LSTM is compared with that of two unidirectional LSTM architectures (forward and backward) for different input image combinations via a case study of crop classification in Anbadegi, Korea. When full time-series images were used as inputs, the Bi-LSTM outperformed the unidirectional LSTM architectures; however, the difference in classification accuracy was not substantial. In contrast, when the multitemporal images did not include information useful for discriminating crops, the Bi-LSTM could compensate for the information deficiency by including temporal information from both forward and backward states, thereby achieving the best classification accuracy compared with the unidirectional LSTM. These case study results indicate the efficiency of the Bi-LSTM for crop classification, particularly when limited input images are available.

File Access Pattern Collection Scheme based on Repetitiveness (반복성을 고려한 파일 액세스 패턴 수집 기법)

  • Hwang-Bo, Jun-Hyoung;Seok, Seong-U;Seo, Dae-Hwa
    • Journal of KIISE:Computer Systems and Theory / v.28 no.12 / pp.674-684 / 2001
  • This paper presents the SIC (Size-Interval-Count) prefetching scheme, which exploits the repetitiveness of file access patterns to record the access patterns of applications in a relatively small amount of memory. Several knowledge-based prefetching methods have recently been introduced that predict the future accesses of applications with high accuracy. They record the access patterns of applications and use the recorded information to predict which blocks will be requested next. However, these methods require too much memory. The proposed method therefore uses its recorded file access patterns, referred to as "SIC access pattern information", to correctly predict the future accesses of applications. The proposed prefetching method improved response time by about 40% compared with a general file system and showed remarkable memory efficiency compared with previous knowledge-based prefetching methods.
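
The abstract does not define the SIC record format, so the following Python sketch is only one plausible reading: a run of equally sized accesses at a constant stride is collapsed into a single (size, interval, count) record plus a start offset. The names `SicRecord`, `compress`, and `predict_next`, and the meaning given to each field, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SicRecord:
    start: int     # offset of the first access in the run (assumed field)
    size: int      # bytes per access
    interval: int  # stride between consecutive accesses
    count: int     # number of accesses in the run


def compress(accesses: List[Tuple[int, int]]) -> List[SicRecord]:
    """Collapse (offset, size) accesses into records for repetitive runs."""
    records: List[SicRecord] = []
    for offset, size in accesses:
        last = records[-1] if records else None
        if last is not None and size == last.size:
            if last.count == 1:
                # The second access of a run fixes its interval.
                last.interval = offset - last.start
                last.count = 2
                continue
            if offset == last.start + last.interval * last.count:
                last.count += 1
                continue
        records.append(SicRecord(offset, size, 0, 1))
    return records


def predict_next(record: SicRecord) -> Optional[int]:
    """Predict the next offset of a run; unknown until an interval is seen."""
    if record.count < 2:
        return None
    return record.start + record.interval * record.count


trace = [(0, 4096), (8192, 4096), (16384, 4096), (24576, 4096), (100, 512)]
for rec in compress(trace):
    print(rec, predict_next(rec))
```

Under this reading, four strided accesses occupy one record instead of four trace entries, which is where the memory saving for repetitive patterns would come from.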

A Fast Coeff_token Decoding Method for Efficient Implementation of H.264/AVC CAVLC Decoder (효율적인 H.264/AVC CAVLC 복호화기 구현을 위한 고속 Coeff_token 복원 방식)

  • Moon, Yong-Ho;Park, Tae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.5 / pp.35-42 / 2008
  • In this paper, we propose a fast coeff_token decoding method based on a reconstructed VLCT. Because the conventional decoding method still relies on a large number of memory accesses, it is not suitable for multimedia services such as PMP, PMB, and DVB-H, where fast decoding and low power consumption are required. Based on an analysis of the codeword structure, a new codeword structure and the corresponding memory architecture are developed. The simulation results show that the proposed algorithm reduces memory accesses by 10% to 57% compared with the conventional decoding method. This means that the issues of low power consumption and high-speed decoding can be resolved without degrading video quality or coding efficiency.