• Title/Summary/Keyword: memory optimization


Optimization of Controller Parameters using A Memory Cell of Immune Algorithm (면역알고리즘의 기억세포를 이용한 제어기 파라메터의 최적화)

  • Park, Jin-Hyeon;Choe, Yeong-Gyu
    • The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.8 / pp.344-351 / 2002
  • The proposed immune algorithm is an optimization algorithm with an uncomplicated structure and a memory-cell mechanism, imitating the principle of the humoral immune response. We use it to solve parameter optimization problems. Until now, immune algorithms have been applied only to optimization problems with non-varying system parameters, so the memory-cell mechanism has offered no real benefit. This paper proposes an immune algorithm whose memory-cell mechanism makes it applicable to systems with nonlinearly varying parameters. To verify the performance of the proposed algorithm, speed control of a nonlinear DC motor is performed. Computer simulation results show that the proposed immune algorithm achieves fast convergence and good control performance under varying system parameters.
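
A minimal sketch of the kind of optimizer this abstract describes, in Python: a humoral-immune-style search over controller parameters whose best antibodies are archived as memory cells and reused to seed later runs. The cost function, bounds, and all tuning constants here are illustrative assumptions, not the paper's actual controller or DC-motor model.

```python
import random

def immune_optimize(cost, bounds, pop_size=20, generations=50, memory=None):
    """Minimal humoral-immune-style optimizer with a memory-cell archive.

    cost    : function mapping a parameter vector (list of floats) to a scalar cost
    bounds  : list of (low, high) tuples, one per parameter
    memory  : list of previously stored memory cells (parameter vectors); they
              seed the initial population so past solutions are reused
    """
    memory = memory or []

    def random_antibody():
        return [random.uniform(lo, hi) for lo, hi in bounds]

    # Seed the population from memory cells first, then fill with random antibodies.
    population = [list(m) for m in memory[:pop_size]]
    population += [random_antibody() for _ in range(pop_size - len(population))]

    for _ in range(generations):
        # Affinity is the inverse of cost; sort so the best antibodies come first.
        population.sort(key=cost)
        survivors = population[: pop_size // 2]

        # Clone and mutate the survivors (proliferation / hypermutation step).
        clones = []
        for ab in survivors:
            child = [min(hi, max(lo, g + random.gauss(0, 0.1 * (hi - lo))))
                     for g, (lo, hi) in zip(ab, bounds)]
            clones.append(child)
        population = survivors + clones

    population.sort(key=cost)
    best = population[0]
    memory.append(best)          # store the best antibody as a memory cell
    return best, memory

# Hypothetical usage: best_gains, memory = immune_optimize(step_response_cost,
#                                                          [(0, 10), (0, 5), (0, 1)])
# where step_response_cost is an assumed function rating a set of controller gains.
```

The returned `memory` list is what makes re-optimization cheap when the plant changes: a later call can pass the archive back in so the search starts from previously good gains rather than from random antibodies.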

MODIFIED LIMITED MEMORY BFGS METHOD WITH NONMONOTONE LINE SEARCH FOR UNCONSTRAINED OPTIMIZATION

  • Yuan, Gonglin;Wei, Zengxin;Wu, Yanlin
    • Journal of the Korean Mathematical Society / v.47 no.4 / pp.767-788 / 2010
  • In this paper, we propose two limited memory BFGS algorithms with a nonmonotone line search technique for unconstrained optimization problems. The global convergence of the given methods is established under suitable conditions. Numerical results show that the presented algorithms are more competitive than the standard BFGS method.
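
The abstract names two ingredients, a limited-memory BFGS update and a nonmonotone line search. The sketch below combines them in their standard textbook forms (the two-loop recursion and a Grippo-Lampariello-Lucidi-style Armijo test over the last M function values); it is not the paper's modified update formula, only an illustration of the general scheme under those assumptions.

```python
import numpy as np

def lbfgs_nonmonotone(f, grad, x0, m=5, M=10, max_iter=200, tol=1e-6):
    """Limited-memory BFGS with a nonmonotone (GLL-style) Armijo line search.

    f, grad : objective and its gradient
    m       : number of (s, y) correction pairs kept in memory
    M       : window length for the nonmonotone reference value max f(x_{k-j})
    """
    x = np.asarray(x0, dtype=float)
    s_list, y_list = [], []
    f_hist = [f(x)]

    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break

        # Two-loop recursion: d = -H_k g using the stored correction pairs.
        q = g.copy()
        alphas = []
        for s, y in zip(reversed(s_list), reversed(y_list)):
            rho = 1.0 / (y @ s)
            a = rho * (s @ q)
            alphas.append((a, rho, s, y))
            q -= a * y
        gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1]) if s_list else 1.0
        r = gamma * q
        for a, rho, s, y in reversed(alphas):
            b = rho * (y @ r)
            r += (a - b) * s
        d = -r

        # Nonmonotone Armijo: compare against the max of the last M function values.
        f_ref = max(f_hist[-M:])
        t, sigma = 1.0, 1e-4
        while f(x + t * d) > f_ref + sigma * t * (g @ d):
            t *= 0.5
            if t < 1e-12:
                break

        x_new = x + t * d
        s, y = x_new - x, grad(x_new) - g
        if s @ y > 1e-10:            # keep the pair only if curvature is positive
            s_list.append(s); y_list.append(y)
            if len(s_list) > m:
                s_list.pop(0); y_list.pop(0)
        x = x_new
        f_hist.append(f(x))
    return x
```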

Automated optimization for memory-efficient high-performance deep neural network accelerators

  • Kim, HyunMi;Lyuh, Chun-Gi;Kwon, Youngsu
    • ETRI Journal / v.42 no.4 / pp.505-517 / 2020
  • The increasing size and complexity of deep neural networks (DNNs) necessitate the development of efficient high-performance accelerators. An efficient memory structure and operating scheme, along with dataflow control, provide an intuitive path to high-performance accelerators. Furthermore, processing various neural networks (NNs) requires a flexible memory architecture, a programmable control scheme, and automated optimizations. We first propose an efficient, flexible architecture that operates at a high frequency despite the large memory and PE-array sizes. We then improve the efficiency and usability of the architecture by automating its optimization algorithm. The experimental results show that the architecture increases data reuse, and a diagonal write path improves performance by 1.44× on average across a wide range of NNs. The automated optimizations enhance performance by 3.8× to 14.79× and further improve usability. Therefore, automating the optimization, as well as designing an efficient architecture, is critical to realizing high-performance DNN accelerators.
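
The abstract does not spell out the automated optimization itself, so the toy search below only illustrates the general flavor of such automation: exhaustively choosing tile sizes for a 3×3-convolution-like workload so that the working set fits a hypothetical on-chip buffer while maximizing data reuse (MACs per byte moved). All shapes, element sizes, and the buffer capacity are made-up parameters, not values from the paper.

```python
from itertools import product

def pick_tiles(H, W, C, K, buffer_bytes, elem_bytes=2):
    """Toy automated tiling search: choose output-tile sizes (th, tw, tk) so the
    input tile, weight tile and output tile fit in the on-chip buffer, and data
    reuse (MACs per byte moved from off-chip memory) is maximized."""
    best = None
    for th, tw, tk in product((1, 2, 4, 8, 16, 32), repeat=3):
        if th > H or tw > W or tk > K:
            continue
        # 3x3 convolution working set for one tile (halo of 1 on each side).
        in_tile  = (th + 2) * (tw + 2) * C
        w_tile   = 3 * 3 * C * tk
        out_tile = th * tw * tk
        footprint = (in_tile + w_tile + out_tile) * elem_bytes
        if footprint > buffer_bytes:
            continue
        macs  = th * tw * tk * C * 9          # work done per tile
        moved = footprint                     # bytes fetched/stored per tile
        reuse = macs / moved
        if best is None or reuse > best[0]:
            best = (reuse, (th, tw, tk))
    return best

# Hypothetical layer and buffer size, for illustration only.
print(pick_tiles(H=56, W=56, C=64, K=64, buffer_bytes=128 * 1024))
```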

Parallel Topology Optimization on Distributed Memory System (분산 메모리 시스템에서의 병렬 위상 최적설계)

  • Lee Ki-Myung;Cho Seon-Ho
    • Proceedings of the Computational Structural Engineering Institute Conference / 2006.04a / pp.291-298 / 2006
  • A parallelized topology design optimization method is developed on a distributed memory system. The parallelization is based on a domain decomposition method and a boundary communication scheme. For the finite element analysis of structural responses and design sensitivities, a PCG solver based on a Krylov iterative scheme is employed. A parallelized optimality criteria method is also used to solve large-scale topology optimization problems. Through several numerical examples, the developed method shows efficient and acceptable topology optimization results for large-scale problems.
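
The parallel machinery (domain decomposition, boundary communication, the PCG solver) is not shown here; the sketch below is only the serial optimality-criteria density update that such a parallel optimizer distributes, written in the standard compliance-minimization form. Variable names and constants are assumptions.

```python
import numpy as np

def oc_update(x, dc, dv, volfrac, move=0.2):
    """One optimality-criteria update of element densities x.

    x       : current element densities in (0, 1]
    dc      : sensitivities of compliance w.r.t. x (negative values)
    dv      : sensitivities of the volume constraint (typically element volumes)
    volfrac : prescribed volume fraction
    The Lagrange multiplier for the volume constraint is found by bisection.
    """
    l1, l2 = 1e-9, 1e9
    while (l2 - l1) / (l1 + l2) > 1e-3:
        lmid = 0.5 * (l1 + l2)
        # Standard OC heuristic: scale densities by sqrt(-dc / (lmid * dv)),
        # then clip to the move limit and the box constraints [1e-3, 1].
        x_new = x * np.sqrt(np.maximum(-dc, 0) / (lmid * dv))
        x_new = np.clip(x_new, x - move, x + move)
        x_new = np.clip(x_new, 1e-3, 1.0)
        if x_new.mean() > volfrac:
            l1 = lmid
        else:
            l2 = lmid
    return x_new
```

In the distributed-memory setting each process would update the densities of its own subdomain, with the bisection on the volume multiplier driven by a global reduction of the element volumes across processes.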

Compiler Optimization Techniques for The Next Generation Low Power Multibank Memory (차세대 저전력 멀티뱅크 메모리를 위한 컴파일러 최적화 기법)

  • Cho, Doosan
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.141-145 / 2021
  • Various types of memory architectures have been developed, and various compiler optimization techniques have been studied to use them efficiently. In particular, since memory is a major component determining the performance of mobile computing devices, many optimization techniques have been developed to support it. Recently, much research has been conducted on hybrid memory architectures, and various compiler techniques are being studied to support them. Existing compiler optimization techniques can be used to meet the minimum performance and low-power constraints demanded by the market, but reference data for judging the low-power effect and the degree of performance improvement achievable with these techniques are not yet properly available. This study provides experimental results for existing compiler techniques as a reference for the development of multibank memory architectures.
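
The abstract does not identify the specific compiler passes measured, so the snippet below only sketches one representative multibank optimization such a study might exercise: greedily assigning arrays to banks so that arrays accessed in the same loop iteration land in different banks (avoiding bank conflicts) while keeping bank loads balanced, which also leaves lightly used banks available for low-power modes. The arrays, sizes, and conflict pairs in the example are hypothetical.

```python
def assign_banks(arrays, conflicts, n_banks):
    """Greedy bank assignment.

    arrays    : {name: size_in_bytes}, visited largest-first
    conflicts : set of frozenset({a, b}) pairs accessed in the same loop
                iteration, which we try to place in different banks
    Returns {name: bank_index}.
    """
    banks = [{"load": 0, "members": set()} for _ in range(n_banks)]
    placement = {}
    for name in sorted(arrays, key=arrays.get, reverse=True):
        def penalty(i):
            bank = banks[i]
            conflict_cost = sum(1 for m in bank["members"]
                                if frozenset({name, m}) in conflicts)
            return (conflict_cost, bank["load"])   # avoid conflicts, then balance load
        best = min(range(n_banks), key=penalty)
        banks[best]["members"].add(name)
        banks[best]["load"] += arrays[name]
        placement[name] = best
    return placement

# Hypothetical example: A and B are streamed together in the main loop.
print(assign_banks({"A": 4096, "B": 4096, "C": 1024},
                   {frozenset({"A", "B"})}, n_banks=2))
```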

A CLASS OF NONMONOTONE SPECTRAL MEMORY GRADIENT METHOD

  • Yu, Zhensheng;Zang, Jinsong;Liu, Jingzhao
    • Journal of the Korean Mathematical Society / v.47 no.1 / pp.63-70 / 2010
  • In this paper, we develop a nonmonotone spectral memory gradient method for unconstrained optimization, in which the spectral stepsize and a class of memory gradient directions are combined efficiently. Global convergence is obtained by using a nonmonotone line search strategy, and numerical tests are given to show the efficiency of the proposed algorithm.
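
The paper's exact memory-gradient direction and stepsize rule are not given in the abstract; the sketch below assembles a plausible instance of the named ingredients: a direction mixing the current negative gradient with a few previous directions (scaled by a Cauchy-Schwarz bound so it provably stays a descent direction), a Barzilai-Borwein spectral stepsize as the trial step, and a nonmonotone Armijo test over the last M function values. Treat the constants and the particular mixing rule as assumptions.

```python
import numpy as np

def spectral_memory_gradient(f, grad, x0, m=3, M=10, rho=0.5,
                             max_iter=500, tol=1e-6):
    """Illustrative nonmonotone spectral memory-gradient iteration.

    Direction : d_k = -g_k + sum_i beta_i d_{k-i}, with each beta_i small enough
                that g_k^T d_k <= -(1 - rho) * ||g_k||^2 < 0 (descent).
    Stepsize  : Barzilai-Borwein spectral value as the trial step, backtracked
                under a nonmonotone Armijo test over the last M function values.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d_hist, f_hist = [], [f(x)]
    alpha = 1.0

    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -g
        for d_old in d_hist[-m:]:
            beta = rho * np.linalg.norm(g) / (m * np.linalg.norm(d_old) + 1e-16)
            d += beta * d_old

        # Nonmonotone Armijo test against the max of the last M objective values.
        f_ref, t = max(f_hist[-M:]), alpha
        while f(x + t * d) > f_ref + 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5

        x_new = x + t * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y) if s @ y > 1e-12 else 1.0   # BB1 spectral stepsize
        x, g = x_new, g_new
        d_hist.append(d); f_hist.append(f(x))
    return x
```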

Immune Algorithm Controller Design of DC Motor with parameters variation (DC 모터 파라메터 변동에 대한 면역 알고리즘 제어기 설계)

  • 박진현;전향식;이민중;김현식;최영규
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.175-178 / 2002
  • The proposed immune algorithm is an optimization algorithm with an uncomplicated structure and a memory-cell mechanism, imitating the principle of the humoral immune response, and it has been used to solve parameter optimization problems. Until now, immune algorithms have been applied only to optimization problems with non-varying system parameters, so the memory-cell mechanism, one of the algorithm's merits, has had no effect. This paper proposes an immune algorithm whose memory-cell mechanism makes it applicable to systems with nonlinearly varying parameters. To verify the performance of the proposed algorithm, speed control of a nonlinear DC motor is performed. Computer simulation studies show that the proposed immune algorithm achieves fast convergence and good control performance under varying system parameters.
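
Complementing the optimizer sketched under the journal version of this work above, the short helper below illustrates the point this abstract emphasizes: when the plant parameters vary, previously stored memory cells recorded under similar operating conditions can seed the new search instead of starting from scratch. The pairing of an "antigen" (operating-condition vector) with each stored antibody is an assumed bookkeeping convention, not a detail taken from the paper.

```python
def recall_memory_cells(memory, antigen, k=3):
    """Return the k stored memory cells whose recorded operating condition
    ('antigen', e.g. a vector of identified plant parameters) is closest to
    the current one, to seed a fresh immune-algorithm run.

    memory  : list of (antigen_vector, antibody_vector) pairs from earlier runs
    antigen : current operating-condition vector
    """
    def distance(entry):
        stored_antigen, _ = entry
        return sum((a - b) ** 2 for a, b in zip(stored_antigen, antigen))
    closest = sorted(memory, key=distance)[:k]
    return [antibody for _, antibody in closest]

# Hypothetical usage: seeds = recall_memory_cells(memory, antigen=[R_new, L_new, J_new])
# and the seeds initialise part of the population before re-optimizing the gains.
```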

A Case Study of a Navigator Optimization Process

  • Cho, Doosan
    • International journal of advanced smart convergence / v.6 no.1 / pp.26-31 / 2017
  • When a mobile navigation device accesses data randomly, cache performance deteriorates rapidly because of low memory access locality. For instance, the GPS (Global Positioning System) portion of a navigation program for automobiles or drones uses data from 32 satellites to compute the receiver's current position. This positioning computation is the major part of GPS processing, accounting for more than 50% of the computation in the program. In this task, satellite signals are received in real time and stored in buffer memories; because the necessary data cannot be stored sequentially, it is read and used in random order. These randomly generated access patterns degrade memory system performance through low data locality, making it difficult to process the data in real time. Improving the low memory access locality inherent in the algorithms of conventional communication applications requires a suitable optimization technique. In this study, we apply data and memory optimizations to address the locality problem. In our experiments, this case study improves the processing speed of the core computation and improves overall system performance by 14%.
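
The paper's code is not available from the abstract, so the fragment below only illustrates the general data/memory optimization it points at: instead of walking per-satellite records scattered in arrival order, gather the fields the position solver needs into contiguous arrays once per epoch so the hot loop streams through memory with unit stride. The record layout and field names are hypothetical.

```python
import numpy as np

# Hypothetical raw buffer: measurements arrive in an arbitrary order, one record
# per tracked satellite, so the solver's accesses have poor locality.
raw_records = [
    {"sat_id": 17, "pseudorange": 21_345_678.0, "sat_pos": (1.2e7, 1.9e7, 8.0e6)},
    {"sat_id": 3,  "pseudorange": 20_987_654.0, "sat_pos": (2.1e7, 5.0e6, 1.5e7)},
    # ... more satellites in arrival order
]

def pack_epoch(records):
    """Gather the fields the hot positioning loop needs into contiguous arrays
    (structure-of-arrays), sorted by satellite id, so later passes over the
    data are sequential instead of random."""
    records = sorted(records, key=lambda r: r["sat_id"])
    pseudoranges = np.array([r["pseudorange"] for r in records])
    sat_positions = np.array([r["sat_pos"] for r in records])
    return pseudoranges, sat_positions

pr, pos = pack_epoch(raw_records)
# The least-squares position solve then iterates over pr/pos with unit stride,
# instead of dereferencing scattered per-satellite records.
```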

Memory Hierarchy Optimization in Embedded Systems using On-Chip SRAM (On-Chip SRAM을 이용한 임베디드 시스템 메모리 계층 최적화)

  • Kim, Jung-Won;Kim, Seung-Kyun;Lee, Jae-Jin;Jung, Chang-Hee;Woo, Duk-Kyun
    • Journal of KIISE:Computer Systems and Theory / v.36 no.2 / pp.102-110 / 2009
  • The memory wall is the growing disparity in speed between the CPU and the memory outside the CPU chip. An economical solution is a memory hierarchy organized into several levels, such as processor registers, cache, main memory, and disk storage. We introduce, for the first time, a memory hierarchy optimization technique for Linux-based embedded systems that uses on-chip SRAM. The technique allocates on-chip SRAM to code and data selected by the programmer, using the virtual memory system. Experiments with nine applications indicate runtime improvements of up to 35% (14% on average) and energy reductions of up to 40% (15% on average).
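
The abstract does not describe the kernel mechanism, so the sketch below is only a user-space approximation of the idea on Linux: map the physical address range of the on-chip SRAM into the process through the virtual memory system (here via /dev/mem) and place the programmer-selected hot data there. The base address and size are placeholders, and a real deployment would need kernel support and appropriate permissions.

```python
import mmap
import os

# Placeholder values: the on-chip SRAM's physical base address and size are
# board-specific and would come from the SoC documentation / device tree.
SRAM_PHYS_BASE = 0x4020_0000
SRAM_SIZE      = 64 * 1024

def map_onchip_sram():
    """Map the on-chip SRAM into this process's address space through the
    virtual memory system (a user-space sketch of the idea; requires root
    and an accessible /dev/mem)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        sram = mmap.mmap(fd, SRAM_SIZE,
                         mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=SRAM_PHYS_BASE)
    finally:
        os.close(fd)
    return sram

# Hot data selected by the programmer can then be placed in `sram` (a mmap
# object supporting the buffer protocol), e.g. sram[0:4] = b"\x00" * 4.
```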

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science / v.14 no.4 / pp.391-406 / 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support for handling irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses in a serial loop nest to the underlying data-parallel architecture, based on a comprehensive static memory access pattern analysis. To that end we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search a large design space for an appropriate thread mapping and work-group size. To evaluate the effectiveness of our methodology, we report execution speedups for selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry-standard heterogeneous programming language OpenCL, targeting the NVIDIA GT200 architecture.
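
The full mathematical model is not reproduced in the abstract; the miniature version below captures its spirit under stated assumptions: describe each array reference in a loop nest as an affine function of the loop indices, then pick the loop to map to the fastest-varying thread index so that as many references as possible become unit-stride (coalesced). Real GPGPU mappings must also weigh bank conflicts, work-group sizes, and memory spaces, which this toy score ignores.

```python
def access_stride(coeffs, fastest_loop):
    """coeffs: affine access A[c0*i0 + c1*i1 + ... + const] as a dict
    {loop_name: coefficient}. Returns the address stride seen when the loop
    mapped to the fastest-varying thread index advances by one."""
    return coeffs.get(fastest_loop, 0)

def pick_thread_mapping(accesses, loops):
    """Choose which loop to map to the fastest-varying thread dimension so
    that as many references as possible become unit-stride (coalesced)."""
    def score(loop):
        return sum(1 for coeffs in accesses if abs(access_stride(coeffs, loop)) == 1)
    return max(loops, key=score)

# Toy example: C[i*N + j] = A[i*N + j] + B[j*N + i], with N = 1024.
N = 1024
accesses = [
    {"i": N, "j": 1},   # A[i][j]: unit stride along j
    {"j": N, "i": 1},   # B[j][i]: unit stride along i
    {"i": N, "j": 1},   # C[i][j]: unit stride along j
]
print(pick_thread_mapping(accesses, loops=["i", "j"]))   # -> "j" (2 of 3 coalesced)
```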