• Title/Summary/Keyword: high performance computing

Computational Methods for On-Node Performance Optimization and Inter-Node Scalability of HPC Applications

  • Kim, Byoung-Do;Rosales-Fernandez, Carlos;Kim, Sungho
    • Journal of Computing Science and Engineering, v.6 no.4, pp.294-309, 2012
  • In the age of multi-core processors and specialized accelerators in high performance computing (HPC) systems, it is critical to understand application characteristics and apply suitable optimizations in order to fully utilize advanced computing systems. Often, the process involves multiple stages of application performance diagnosis and a trial-and-error approach to optimization. In this study, a general guideline for performance optimization is demonstrated with two applications, each representative of its class. The main focus is on node-level optimization and inter-node scalability improvement. While the number of optimization case studies in this paper is somewhat limited, the results provide insight into a systematic approach to HPC application performance engineering.

Performance Evaluation and Analysis of Multiple Scenarios of Big Data Stream Computing on Storm Platform

  • Sun, Dawei;Yan, Hongbin;Gao, Shang;Zhou, Zhangbing
    • KSII Transactions on Internet and Information Systems (TIIS), v.12 no.7, pp.2977-2997, 2018
  • In the big data era, fresh data grows rapidly every day. More than 30,000 gigabytes of data are created every second, and the rate is accelerating. Many organizations rely heavily on real time streaming, and big data stream computing helps them spot opportunities and risks in real time big data. Storm, one of the most common online stream computing platforms, has been used for big data stream computing, with response times ranging from milliseconds to sub-seconds. The performance of Storm plays a crucial role in different application scenarios; however, few studies have evaluated it. In this paper, we investigate the performance of Storm under different application scenarios. Our experimental results show that the throughput and latency of Storm are greatly affected by the number of instances of each vertex in the task topology and by the amount of available resources in the data center. The fault-tolerant mechanism of Storm works well in most big data stream computing environments. As a result, we suggest that a dynamic topology, an elastic scheduling framework, and a memory based fault-tolerant mechanism are necessary for providing high throughput and low latency services on the Storm platform.

Power Modeling Approach for GPU Source Program

  • Li, Junke;Guo, Bing;Shen, Yan;Li, Deguang;Huang, Yanhui
    • Journal of Electrical Engineering and Technology, v.13 no.1, pp.181-191, 2018
  • The rapid development of information technology is making our environment smarter, and massive high performance computers provide the computing power behind it. The Graphics Processing Unit (GPU), a typical high performance component, is widely used for both graphics and general-purpose applications. Although it can greatly improve computing power, it also consumes significant power and needs an adequate power supply. To make high performance computing more sustainable, an important step is to measure this power consumption. Current power measurement techniques for GPUs have some drawbacks; for example, they cannot be used for power estimation at an early stage. In this article, we present a novel power modeling technique that correlates power consumption with program characteristics from the programmer's perspective, and then estimates the power consumption of a source program without prerunning it. We conduct experiments on Nvidia's GT740 platform; the results show that our power model is more accurate than a regression model, with an average error of 2.34% and a maximum error of 9.65%.

Algorithm for Improving the Computing Power of Next Generation Wireless Receivers

  • Rizvi, Syed S.
    • Journal of Computing Science and Engineering, v.6 no.4, pp.310-319, 2012
  • Next generation wireless receivers demand algorithms with low computational complexity and high computing power in order to perform fast signal detection and error estimation. Several signal detection and estimation algorithms have been proposed for next generation wireless receivers; they are primarily designed to provide reasonable performance in terms of signal to noise ratio (SNR) and bit error rate (BER). However, none of them has been chosen for direct implementation, as they exhibit high computational complexity with relatively low computing power. This paper presents a low-complexity, power-efficient algorithm that improves the computing power and provides relatively faster signal detection for next generation wireless multiuser receivers. Measurement results for the proposed algorithm are provided, and the overall system performance is indicated by BER and computational complexity. Finally, to verify the low complexity of the proposed algorithm, we also present a formal mathematical proof.

Scalable Approach to Failure Analysis of High-Performance Computing Systems

  • Shawky, Doaa
    • ETRI Journal, v.36 no.6, pp.1023-1031, 2014
  • Failure analysis is necessary to clarify the root cause of a failure, predict the next time a failure may occur, and improve the performance and reliability of a system. However, it is not an easy task to analyze and interpret failure data, especially for complex systems. Usually, these data are represented using many attributes, and sometimes they are inconsistent and ambiguous. In this paper, we present a scalable approach for the analysis and interpretation of failure data of high-performance computing systems. The approach employs rough sets theory (RST) for this task. The application of RST to a large publicly available set of failure data highlights the main attributes responsible for the root cause of a failure. In addition, it is used to analyze other failure characteristics, such as time between failures, repair times, workload running on a failed node, and failure category. Experimental results show the scalability of the presented approach and its ability to reveal dependencies among different failure characteristics.
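
As an illustration of how rough set theory can rank attributes by their influence on a root cause, the following minimal sketch computes the standard dependency degree (the fraction of records whose condition attribute values determine the decision attribute unambiguously). The record fields and values are hypothetical and are not taken from the paper or its data set; the paper's actual procedure is not described in the abstract.

```python
from collections import defaultdict


def partition(records, attrs):
    """Group record indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for index, record in enumerate(records):
        blocks[tuple(record[a] for a in attrs)].add(index)
    return list(blocks.values())


def dependency_degree(records, condition_attrs, decision_attr):
    """Return |positive region| / |universe| for the given attributes."""
    decision_blocks = partition(records, [decision_attr])
    positive = 0
    for block in partition(records, condition_attrs):
        # A block is in the positive region if all of its records
        # share a single decision value.
        if any(block <= decision for decision in decision_blocks):
            positive += len(block)
    return positive / len(records)


# Hypothetical failure records (illustrative values only).
records = [
    {"workload": "compute", "node_type": "A", "root_cause": "hardware"},
    {"workload": "compute", "node_type": "B", "root_cause": "software"},
    {"workload": "graphics", "node_type": "A", "root_cause": "hardware"},
    {"workload": "graphics", "node_type": "B", "root_cause": "software"},
    {"workload": "compute", "node_type": "A", "root_cause": "hardware"},
]

by_node = dependency_degree(records, ["node_type"], "root_cause")
by_workload = dependency_degree(records, ["workload"], "root_cause")
```

In this toy data, `node_type` fully determines `root_cause` while `workload` does not, so an attribute with a higher dependency degree would be flagged as more relevant to the root cause.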

Implementation of Ring Topology Interconnection Network with PCIe Non-Transparent Bridge Interface (PCIe Non-Transparent Bridge 인터페이스 기반 링 네트워크 인터커넥트 시스템 구현)

  • Kim, Sang-Gyum;Lee, Yang-Woo;Lim, Seung-Ho
    • KIPS Transactions on Computer and Communication Systems, v.8 no.3, pp.65-72, 2019
  • High performance computing (HPC) systems connect many computing nodes through a high performance interconnect network. Interconnect technology is therefore one of the key factors in building high performance systems, and InfiniBand or Ethernet is mainly used for it. PCIe is now the main interface within a computer system: the host CPU connects to high performance peripheral devices through PCIe bridges. The PCIe Non-Transparent Bridge (NTB) standard can be used to connect computing nodes, but in its original form it connects only two hosts. To provide a cost-effective interconnect based on PCIe technology, we develop a prototype interconnect network system with PCIe NTB. In the prototype, computing nodes are connected to each other via PCIe NTB interfaces, forming a switchless interconnect network such as a ring network. We have also implemented a prototype data sharing mechanism on this interconnect. The designed PCIe NTB-based interconnect network system is cost-effective and provides competitive data transfer bandwidth within the interconnect network.
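
To make the idea of a switchless ring concrete, here is a minimal sketch of how a message might be forwarded node to node when each host is linked only to its two neighbors. It assumes a bidirectional ring with nodes numbered 0 to n-1 and shortest-direction forwarding; the function name `ring_route` is hypothetical, and the abstract does not say how the prototype actually routes data.

```python
def ring_route(src, dst, n):
    """Return the node ids visited from src to dst along the shorter
    direction of an n-node ring (assumed bidirectional)."""
    if not (0 <= src < n and 0 <= dst < n):
        raise ValueError("node ids must be in range 0..n-1")
    clockwise = (dst - src) % n          # hops when moving to higher ids
    counter = n - clockwise              # hops when moving to lower ids
    step = 1 if clockwise <= counter else -1
    hops = min(clockwise, counter)
    # Each intermediate node forwards the data to its neighbor.
    return [(src + step * k) % n for k in range(hops + 1)]


path_short = ring_route(0, 3, 4)   # one hop in the counter-clockwise direction
path_long = ring_route(1, 3, 6)    # two hops in the clockwise direction
```

Under these assumptions the worst-case path length is n // 2 hops, which is the usual trade-off of a ring: no switch is needed, but latency grows with the number of nodes.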

Integrated Platform on the Basis of Heterogeneous Data to Support the Establishment of an Innovative Ecosystem for National High-Performance Computing: Focusing on Life Science & Public Health Area (국가 초고성능컴퓨팅 혁신 생태계 구축 지원을 위한 이종데이터 기반 통합 플랫폼: 생명·보건분야를 중심으로)

  • Do-Yeon Lee;Myoung-Ju Koh;Jae-Gyoon Hahm;Keun-Hwan Kim
    • Journal of the Korean Society of Industry Convergence, v.26 no.1, pp.1-14, 2023
  • To secure future national competitiveness, the Korean government announced the "National Ultra-High Performance Computing (HPC) Innovation Strategy" (2021.5.28.) and set three innovation strategy goals for establishing an innovation ecosystem. This study presents a heterogeneous data-based strategic support framework for understanding the current status of both domestic and foreign R&D and the domestic industrial economy in strategic fields related to ultra-high performance computing; the empirical research was conducted in the life science and public health area. The HPC innovation ecosystem platform presented in this study, based on the connection of heterogeneous data (domestic R&D project, technology, industry, and overseas R&D project), provided useful and essential information for establishing a specific action plan for the national HPC innovation strategy and for vitalizing the innovation ecosystem. Since evidence-based policy assumes that a more reasonable consensus is reached through an unbiased decision-making process among stakeholders, the proposed platform may enhance policy momentum by increasing the legitimacy of, and trust in, the planning of the national HPC strategy.

A Study on the Implementation Method for the Achievement of the Korea High-Performance Computing Innovation Strategy

  • Choi, Youn Keun;Koh, Myoungju;Jung, Youg Hwan;Hur, YoungJu;Lee, Yeonjae;On, Noori;Hahm, Jaegyoon
    • Journal of Information Science Theory and Practice, v.10 no.spc, pp.76-85, 2022
  • At the 8th National High-Performance Computing (HPC) Committee, convened in 2021, the "National High-Performance Computing Innovation Strategy (draft) for the 4th Industrial Revolution Era" was deliberated and the original draft was approved. In this proposal, the Ministry of Science and ICT of Korea announced three major plans and nine detailed projects with the vision of "Realizing the 4th industrial revolution quantum jumping by leaping into a high-performance computing powerhouse." This established the most important policy for national mid-term and long-term HPC development, called the HPC innovation strategy (hereinafter "the innovation strategy"). The three plans of the innovation strategy proposed by the government are: strategic HPC infrastructure expansion; securing source technologies; and activating innovative HPC utilization. Each of the detailed projects has to be executed nationally and strategically. In this paper, we propose a strategy for implementing two of these plans: "strategic HPC infrastructure expansion" and "activating innovative HPC utilization."

Toward High Utilization of Heterogeneous Computing Resources in SNP Detection

  • Lim, Myungeun;Kim, Minho;Jung, Ho-Youl;Kim, Dae-Hee;Choi, Jae-Hun;Choi, Wan;Lee, Kyu-Chul
    • ETRI Journal, v.37 no.2, pp.212-221, 2015
  • As the amount of re-sequencing genome data grows, the execution time of an analysis must be minimized. For this purpose, recent computing systems have been adopting both high-performance coprocessors and host processors. However, few applications utilize these heterogeneous computing resources efficiently. This problem applies equally to single nucleotide polymorphism (SNP) detection, which is one of the bottlenecks in genome data processing. In this paper, we propose a method for speeding up SNP detection by enhancing the utilization of the heterogeneous computing resources often used in recent high-performance computing systems. By measuring the workload in the detection procedure, we divide SNP detection into several task groups, each suited to a particular computing resource. These task groups are scheduled using a window overlapping method. As a result, we improved upon the speedup achieved by previous open source applications by a factor of 10.

Hybrid in-memory storage for cloud infrastructure

  • Kim, Dae Won;Kim, Sun Wook;Oh, Soo Cheol
    • Journal of Internet Computing and Services, v.22 no.5, pp.57-67, 2021
  • Modern cloud computing is rapidly changing from traditional hypervisor-based virtual machines to container-based cloud-native environments. Because both virtual machines and containers are limited by I/O performance, the use of high-speed storage (SSD, NVMe, etc.) is increasing, and in-memory computing using main memory is also emerging. Running a virtual environment in main memory gives better performance than other storage arrays. However, the RAM used as main memory is expensive, and because it is volatile, data is lost when the system goes down. Therefore, additional work is required to run a virtual environment in main memory. In this paper, we propose a hybrid in-memory storage that combines block storage, such as a high-speed SSD, with main memory to safely operate virtual machines and containers in main memory. In our evaluation, the proposed storage showed 6 times faster writes and 42 times faster reads than regular disks for virtual machines, and an average 12% improvement in container performance tests.