• Title/Summary/Keyword: hardware optimization


A Study on Scalability of Profiling Method Based on Hardware Performance Counter for Optimal Execution of Supercomputer (슈퍼컴퓨터 최적 실행 지원을 위한 하드웨어 성능 카운터 기반 프로파일링 기법의 확장성 연구)

  • Choi, Jieun;Park, Guenchul;Rho, Seungwoo;Park, Chan-Yeol
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.221-230 / 2020
  • A supercomputer that shares limited resources among multiple users needs a way to optimize application execution. To this end, it is useful for system administrators to obtain advance information and hints about the applications to be executed. In most high-performance computing operations, system administrators try to increase system productivity by collecting information about execution duration and resource requirements from users when tasks are submitted. They also use profiling techniques that generate the necessary information from statistics such as system usage to increase system utilization. In a previous study, we proposed a scheduling optimization technique built on a hardware performance counter-based profiling technique that characterizes applications without requiring an understanding of the source code. In this paper, we constructed a profiling testbed cluster to support optimal execution on the supercomputer and evaluated the scalability of the profiling method for analyzing application characteristics in that cluster environment. We also showed experimentally that the profiling method remains usable for actual scheduling optimization even when the application class is reduced or the number of nodes used for profiling is minimized. When the number of nodes used for profiling was reduced to 1/4, the execution time of the application increased by 1.08% compared to profiling using all nodes, and the scheduling optimization performance improved by up to 37% compared to sequential execution. In addition, profiling with a reduced problem size cut the cost of collecting profiling data to a quarter and yielded a performance improvement of up to 35%.

Digital Twin-Based Communication Optimization Method for Mission Validation of Swarm Robot (군집 로봇의 임무 검증 지원을 위한 디지털 트윈 기반 통신 최적화 기법)

  • Gwanhyeok, Kim;Hanjin, Kim;Junhyung, Kwon;Beomsu, Ha;Seok Haeng, Huh;Jee Hoon, Koo;Ho Jung, Sohn;Won-Tae, Kim
    • KIPS Transactions on Computer and Communication Systems / v.12 no.1 / pp.9-16 / 2023
  • Robots are expected to expand their scope of application to the military field and take on important missions such as surveillance and enemy detection in future warfare. Because they comprise multiple robots, swarm robots can more efficiently perform tasks that are difficult or time-consuming for a single robot. Swarm robots require mutual recognition and collaboration, so they send and receive vast amounts of data, which makes software (SW) verification increasingly difficult. Hardware-in-the-loop simulation (HILS), used to increase the reliability of mission verification, enables SW verification of complex swarm robots, but the amount of verification data exchanged between the HILS device and the simulator increases exponentially with the number of systems to be verified, so communication overload may occur. In this paper, we propose a digital twin-based communication optimization technique to solve the communication overload problem that occurs in mission verification of swarm robots. Under the proposed Digital Twin based Multi HILS Framework, the Network DT can efficiently allocate network resources to each robot according to the mission scenario through the Network Controller algorithm, and can satisfy all sensor generation rates required by the individual robots participating in the group. In addition, in an experiment on packet loss, the packet loss rate was reduced from 15.7% to 0.2%.

Comparative Performance Analysis of High Speed Low Power Area Efficient FIR Adaptive Filter

  • Jaiswal, Manish
    • IEIE Transactions on Smart Processing and Computing / v.3 no.5 / pp.267-270 / 2014
  • This paper presents the comparative performance of an adaptive FIR filter based on the Delayed LMS algorithm. A delayed error signal is used to obtain the Delayed LMS algorithm, which allows efficient pipelining to achieve a short critical path and an area-efficient implementation. The paper reports hardware efficiency results (device utilization parameters) and power consumption for the FPGA families Artix-7, Virtex-7, and Kintex-7 from a low-voltage perspective. The synthesis results showed that the Artix-7 CMOS family achieves the lowest power consumption of 1.118 mW with 83.18% device utilization. Different precision strategies, such as speed optimization and power optimization, were applied to achieve these results. The algorithm was implemented using MATLAB (2013b) and synthesized with Leonardo Spectrum.
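
The delayed-error idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's MATLAB or FPGA implementation: the function names `delayed_lms`, `fir`, and `demo`, the step size, the delay, and the test system are all assumptions chosen for illustration. The weight update at step n uses the error and input vector from `delay` steps earlier, which is what lets a hardware design pipeline the error computation.

```python
import random
from collections import deque


def delayed_lms(x, d, num_taps, mu, delay):
    """Adaptive FIR filter with a delayed LMS weight update (illustrative sketch).

    Update rule: w <- w + mu * e(n - delay) * x_vec(n - delay).
    With delay = 0 this reduces to the standard LMS algorithm.
    """
    w = [0.0] * num_taps
    taps = deque([0.0] * num_taps, maxlen=num_taps)  # most recent sample first
    pending = deque()  # (error, input vector) pairs awaiting their delayed update
    errors = []
    for xn, dn in zip(x, d):
        taps.appendleft(xn)  # the oldest sample drops off the right end
        x_vec = list(taps)
        y = sum(wi * xi for wi, xi in zip(w, x_vec))  # filter output
        e = dn - y  # error computed with the current weights
        errors.append(e)
        pending.append((e, x_vec))
        if len(pending) > delay:
            e_old, x_old = pending.popleft()  # pair from `delay` steps ago
            w = [wi + mu * e_old * xi for wi, xi in zip(w, x_old)]
    return w, errors


def fir(h, x):
    """Reference FIR filter used to generate the desired signal."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def demo():
    """Identify a hypothetical 3-tap system from noise-free data."""
    rng = random.Random(0)
    h = [0.5, -0.3, 0.2]  # assumed unknown system, for illustration only
    x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
    d = fir(h, x)
    w, _ = delayed_lms(x, d, num_taps=3, mu=0.05, delay=4)
    return w, h


if __name__ == "__main__":
    w, h = demo()
    print("estimated:", [round(v, 4) for v in w])
    print("true:     ", h)
```

A larger `delay` generally calls for a smaller step size `mu` to keep the adaptation stable, which is the usual trade-off for the shorter critical path.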

An Artificial Neural Network for the Optimal Path Planning (최적경로탐색문제를 위한 인공신경회로망)

  • Kim, Wook;Park, Young-Moon
    • Proceedings of the KIEE Conference / 1991.07a / pp.333-336 / 1991
  • In this paper, an artificial neural network structure similar to the Hopfield and Tank model is proposed for optimal path planning problems, such as unit commitment or maintenance scheduling problems, which have previously been solved by dynamic programming or the branch and bound method. To construct the neural network, an energy function is defined whose global minimum corresponds to the optimal path of the problem. To avoid falling into a local minimum during the optimization process, simulated annealing is applied by gradually making the slope of the sigmoid transfer functions steeper as the process progresses. As a result, computer (IBM 386-AT, 34 MHz) simulations can solve the optimal unit commitment problem with 10 power units and 24 hour periods (1 hour factor) in 5 minutes. Furthermore, if fully parallel neural network hardware is constructed, the optimization time will be reduced remarkably.


Effective Variations of Simulated Annealing and Their Implementation for High Level Synthesis (Simulated Annealing 의 효과적 변형 및 HLS 에의 적용)

  • Yoon, B.S.;Song, N.U.
    • Journal of Korean Institute of Industrial Engineers / v.21 no.1 / pp.33-49 / 1995
  • Simulated annealing (SA) is accepted as a general-purpose optimization technique that can be applied to almost all kinds of combinatorial optimization problems without much difficulty. However, some weak points remain to be resolved, one of which is its slow convergence. In this study, we carefully review previous efforts to improve SA and propose some variations of SA that can speed up convergence to the optimum solution. We then apply the revised SA algorithms to the scheduling and hardware allocation problems that occur in high-level synthesis (HLS) of VLSI design. We confirm the efficiency of the proposed methods through several HLS examples.


Optimal Transducer Positions of an Active Noise Control System with an Opening in an Enclosure (개구부를 가지는 실내의 능동소음제어시스템의 최적 트랜스듀서 위치)

  • 백광현
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.14 no.2 / pp.157-163 / 2004
  • Optimal transducer positions are as important as the control algorithms and hardware performance in an active noise control system. This study is similar to past research on optimal transducer locations, but it considers a far-field noise source with plane-wave characteristics and noise entering through an opening, such as a window, in an enclosure. Optimization techniques are used to find sets of optimal loudspeaker positions from a larger set of possible loudspeaker positions. Loudspeakers are placed on the surface of the opening in the wall and inside the enclosure. Using measured acoustic transfer impedances and numerical simulations with the optimization technique, optimal positions are identified and compared. When a small number of loudspeakers is used, positions on the opening near the center seem to be best, but when a larger number of loudspeakers is used, it was difficult to find simple patterns in the optimal positions. With the optimally positioned loudspeakers, optimal microphone positions are also studied.

Optimal Loudspeaker Positions of an Active Noise Control System with an Opening in an Enclosure (개구부를 가지는 실내의 능동소음제어시스템에서의 최적스피커 위치)

  • 백광현
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2003.11a / pp.788-791 / 2003
  • Optimal loudspeaker positions are as important as the control algorithms and hardware performance in an active noise control system. This study is similar to past research on optimal transducer locations, but it considers a far-field noise source with plane-wave characteristics and noise entering through an opening, such as a window, in the enclosure. An optimization technique, the simulated annealing algorithm, is used to find a set of optimal loudspeaker positions from a larger set of possible loudspeaker positions. Loudspeakers are placed on the surface of the opening in the wall. Using measured acoustic transfer impedances and numerical simulations with the optimization technique, optimal positions are identified and compared. When a small number of loudspeakers is used, positions on the opening near the center seem to be best, but when a larger number of loudspeakers is used, it was difficult to find simple patterns in the optimal positions.


A Survey of Energy Efficiency Optimization in Heterogeneous Cellular Networks

  • Abdulkafi, Ayad A.;Kiong, Tiong S.;Sileh, Ibrahim K.;Chieng, David;Ghaleb, Abdulaziz
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.2 / pp.462-483 / 2016
  • Research on optimizing the energy efficiency (EE) of cellular networks for environmental and economic sustainability has attracted increasing attention recently. In this survey, we discuss the opportunities, trends, and challenges of this topic. Two major contributions are presented: 1) a survey of proposed energy efficiency metrics and 2) a survey of proposed energy-efficient solutions. We provide a broad overview of state-of-the-art energy-efficient methods covering base station (BS) hardware design, network planning and deployment, and network management and operation stages. To further understand how EE is assessed and improved through the heterogeneous network (HetNet), the BS's energy-awareness and several typical HetNet deployment scenarios, such as macrocell-microcell and macrocell-picocell, are presented. The analysis of different HetNet deployment scenarios gives insights into the successful deployment of energy-efficient cellular networks.

TVM-based Performance Optimization for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 TVM기반의 성능 최적화 연구)

  • Cheonghwan Hur;Minhae Ye;Ikhee Shin;Daewoo Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.3 / pp.101-108 / 2023
  • Optimizing the performance of deep neural networks on embedded systems is a challenging task that requires efficient compilers and runtime systems. We propose a TVM-based approach that consists of three steps: quantization, auto-scheduling, and ahead-of-time compilation. Our approach reduces the computational complexity of models without significant loss of accuracy, and generates optimized code for various hardware platforms. We evaluate our approach on three representative CNNs using ImageNet Dataset on the NVIDIA Jetson AGX Xavier board and show that it outperforms baseline methods in terms of processing speed.

A design of Viterbi decoder for memory optimization (메모리 최적화를 위한 Viterbi 디코더의 설계)

  • 신동석;박종진;김은원;조원경
    • Proceedings of the IEEK Conference / 1998.06a / pp.285-288 / 1998
  • The Viterbi decoder implements a maximum likelihood decoding method for the convolutional codes used in satellite and mobile communications. In this paper, a Viterbi decoder for a convolutional code with a constraint length of K=7, 3-soft decision, and a traceback depth of $\Gamma$=96 is implemented using VHDL. The hardware size of the designed decoder is reduced by a 4-bit pre-traceback in the survivor memory.
