• Title/Summary/Keyword: performance scalability


Design and Implementation of a Massively Parallel Multithreaded Architecture: DAVRID

  • Sangho Ha;Kim, Junghwan;Park, Eunha;Yoonhee Hah;Sangyong Han;Daejoon Hwang;Kim, Heunghwan;Seungho Cho
    • Journal of Electrical Engineering and Information Science / v.1 no.2 / pp.15-26 / 1996
  • Massively parallel architectures (MPAs) must address two fundamental issues for scalability: synchronization and communication latency. Dataflow architectures can exploit the massive parallelism inherent in programs, but they suffer from excessive synchronization overhead and inefficient execution of sequential programs. In contrast, MPAs based on the von Neumann computational model may suffer from inefficient synchronization mechanisms and communication latency. DAVRID (DAtaflow/Von Neumann RISC hybrID) is a massively parallel multithreaded architecture that takes advantage of both the von Neumann and dataflow models. It achieves good single-thread performance while tolerating synchronization and communication latency. In this paper, we describe the DAVRID architecture in detail and evaluate its performance through simulation runs over several benchmarks.


Study on Accelerating Distributed ML Training in Orchestration

  • Su-Yeon Kim;Seok-Jae Moon
    • International journal of advanced smart convergence / v.13 no.3 / pp.143-149 / 2024
  • As the size of data and models in machine learning continues to grow, training on a single server is becoming increasingly challenging. Consequently, distributed machine learning, which spreads the computational load across multiple machines, is becoming more important. However, several issues in improving the performance of distributed machine learning remain unresolved, including communication overhead, inter-node synchronization, data imbalance and bias, and resource management and scheduling. In this paper, we propose ParamHub, which uses orchestration to accelerate training. The system monitors the performance of each node after the first iteration and reallocates resources to slow nodes, thereby speeding up the training process. This approach ensures that resources go to the nodes that need them, maximizing overall resource utilization and letting all nodes progress at a uniform pace, which results in faster training overall. Furthermore, the method enhances the system's scalability and flexibility, allowing it to be applied effectively to clusters of various sizes.
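
The abstract does not say how ParamHub decides how much to reallocate, so the following is only a minimal sketch of the general idea (measure each node after the first iteration, then shift resources toward slow nodes). It assumes that a node's iteration time is inversely proportional to its CPU share; the names `NodeStats` and `rebalance` and the CPU-share unit are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    cpu_share: float       # CPU units currently assigned (hypothetical unit)
    iteration_time: float  # seconds taken by the first training iteration


def rebalance(nodes):
    """Redistribute the total CPU budget so predicted iteration times equalize.

    Assumed model (not from the paper): time is inversely proportional to
    share, so time * share estimates how much work each node has to do.
    """
    budget = sum(n.cpu_share for n in nodes)
    work = {n.name: n.iteration_time * n.cpu_share for n in nodes}
    total_work = sum(work.values())
    # Each node receives a slice of the budget proportional to its work.
    return {name: budget * w / total_work for name, w in work.items()}


# Two nodes with equal shares; worker-b was three times slower.
nodes = [NodeStats("worker-a", 2.0, 10.0), NodeStats("worker-b", 2.0, 30.0)]
new_shares = rebalance(nodes)
```

Under this model the slow node ends up with three times the share of the fast one, and both are predicted to finish an iteration in the same time. A real orchestrator would also have to respect per-node limits and re-measure after each adjustment.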

Performance Analysis for Multimedia Video Codec on On-Chip Network (온칩 네트워크 기반 멀티미디어 비디오 코덱 성능 분석)

  • Chang, J.Y.;Kim, W.J.;Byun, K.J.;Eum, N.W.
    • Smart Media Journal / v.1 no.1 / pp.27-35 / 2012
  • This paper presents a performance analysis of multimedia video codecs (MPEG-4, H.264) on an on-chip network communication architecture. The On-Chip Network (OCN) is a new communication architecture for multimedia SoC design that overcomes the limits of the on-chip bus architecture by providing higher data traffic bandwidth, reusability, and scalability. We compared the performance of MPEG-4 and H.264 decoders built on the on-chip network with that of decoders built on the AMBA on-chip bus. Experimental results show that the on-chip network designs improve performance by 33% to 56% compared with the AMBA on-chip bus designs.


Multi-View Supporting VR/AR Visualization System for Supercomputing-based Engineering Analysis Services (슈퍼컴퓨팅 기반의 공학해석 서비스 제공을 위한 멀티 뷰 지원 VR/AR 가시화 시스템 개발)

  • Seo, Dong Woo;Lee, Jae Yeol;Lee, Sang Min;Kim, Jae Seong;Park, Hyung Wook
    • Korean Journal of Computational Design and Engineering / v.18 no.6 / pp.428-438 / 2013
  • The demand for high-performance visualization of engineering analyses of digital products is increasing because current analysis problems keep growing in size and complexity, which requires high-performance codes as well as high-performance computing systems. However, many companies and customers lack such facilities or have difficulty accessing those computing resources. In this paper, we present a multi-view supporting VR/AR system for providing supercomputing-based engineering analysis services. The proposed system is designed to provide different views supporting VR/AR visualization services depending on the customer's requirements. It provides sophisticated VR rendering that relies directly on a supercomputing resource, as well as remotely accessible AR visualization. By providing multi-view centric analysis services, the proposed system can be applied more easily to customers requiring different levels of high-performance computing resources. We demonstrate the scalability and vision of the proposed approach through illustrative examples with different levels of complexity.

Enhanced MAC Scheme to Support QoS Based on Network Detection over Wired-cum-Wireless Network

  • Kim, Moon;Ye, Hwi-Jin;Cho, Sung-Joon
    • Journal of information and communication convergence engineering / v.4 no.4 / pp.141-146 / 2006
  • Wireless data services are becoming ubiquitous in daily life because they offer several fundamental benefits, including user mobility, rapid installation, flexibility, and scalability. Moreover, demand for various multimedia services and Quality of Service (QoS) support has become a key issue in wireless data communications. Research on Medium Access Control (MAC) has therefore progressed rapidly. In particular, a number of QoS-aware MAC schemes have been introduced to extend the legacy IEEE 802.11 MAC protocol, which does not guarantee any service differentiation. However, although these schemes support priority-based service differentiation, none of them achieves both QoS and channel efficiency. This paper therefore studies a novel MAC scheme, referred to as Enhanced Distributed Coordination Function with Network Adaptation (EDCF-NA), that aims to improve both QoS and medium efficiency. It uses a smart factor based on the ACK rate and a Network Load Threshold (TH). We study how the value of TH affects MAC performance and how using an optimal TH pair improves overall MAC performance in terms of QoS, channel utilization, collision rate, and fairness. In addition, through simulations with Network Simulator-2 (NS-2), we evaluate the performance of EDCF-NA under several TH pairs and compare it with that of various other MAC protocols.

TinyIBAK: Design and Prototype Implementation of An Identity-based Authenticated Key Agreement Scheme for Large Scale Sensor Networks

  • Yang, Lijun;Ding, Chao;Wu, Meng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.11 / pp.2769-2792 / 2013
  • In this paper, we propose TinyIBAK, an authenticated key agreement scheme for large-scale sensor networks based on identity-based cryptography and bilinear pairing. We prove the security of our proposal in the random oracle model. According to formal security validation using AVISPA, the proposed scheme is strongly secure against passive and active attacks such as replay, man-in-the-middle, and node compromise attacks. We implemented our proposal for TinyOS-2.1, analyzed its memory occupation, and evaluated its time and energy performance on MICAz motes using the Avrora toolkit. Moreover, we deployed our proposal within the TOSSIM simulation framework and investigated the effect of node density on the performance of our scheme. Experimental results indicate that our proposal consumes an acceptable amount of resources and is feasible for infrequent key distribution and rekeying in large-scale sensor networks. Compared with other ID-based key agreement approaches, TinyIBAK is either much more efficient or comparable in performance, and it additionally provides rekeying. Compared with traditional key pre-distribution schemes, TinyIBAK achieves significant improvements in security strength, key connectivity, scalability, and communication and storage overhead, and it enables efficient secure rekeying.

Multiple Constraint Routing Protocol for Frequency Diversity Multi-channel Mesh Networks using Interference-based Channel Allocation

  • Torregoza, John Paul;Hwang, Won-Joo
    • Journal of Korea Multimedia Society / v.10 no.12 / pp.1632-1644 / 2007
  • Wireless Mesh Networks aim to attain large connectivity with minimal performance degradation as network size increases. As such, scalability is one of the main characteristics that differentiates Wireless Mesh Networks from other wireless networks. This characteristic creates the need for bandwidth efficiency strategies that keep network performance from degrading as the network grows. Several studies have been conducted to realize mesh networks. However, most of them focused on a single TCP/IP layer at a time. Also, bandwidth efficiency and bandwidth improvement are usually treated as separate issues. This paper aims to study bandwidth efficiency and bandwidth improvement simultaneously. Aside from optimizing the bandwidth for a fixed capacity, the capacity itself is increased using results from physical layer studies. In this paper, the capacity is improved by using non-overlapping channels for wireless communication. A channel allocation scheme is devised to choose the transmission channel that optimizes the network performance parameters while taking chosen Quality of Service (QoS) parameters into account. Network utility maximization is used to optimize the bandwidth after channel selection. Furthermore, a routing scheme is proposed that uses the results of the network utility maximization and the channel allocation scheme to find the optimal path that maximizes the network gain.


An Efficient Guitar Chords Classification System Using Transfer Learning (전이학습을 이용한 효율적인 기타코드 분류 시스템)

  • Park, Sun Bae;Lee, Ho-Kyoung;Yoo, Do Sik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1195-1202 / 2018
  • Artificial neural networks are widely used for their excellent performance and ease of implementation. However, a traditional neural network must learn the system from scratch whenever new input data are added, the observation environment varies, or the form of the input/output data changes. Transfer learning has been proposed to resolve this problem. Transfer learning constructs a new target system by partially updating an existing system and hence provides a much more efficient learning process. Until now, transfer learning has mainly been studied in image processing and is not yet widely employed in acoustic data processing. In this paper, focusing on the scalability of transfer learning, we apply the concept of transfer learning to the problem of guitar chord classification and evaluate its performance. For this purpose, we build a target system, a convolutional neural network (CNN) based classifier for 48 guitar chords, by applying transfer learning to a source system, a CNN-based classifier for 24 guitar chords. We show that the system with transfer learning performs similarly to a conventional system but requires only half the learning time.

Efficient Update Method for Cloud Storage System

  • Khill, Ki-Jeong;Lee, Sang-Min;Kim, Young-Kyun;Shin, Jaeryong;Song, Seokil
    • International Journal of Contents / v.10 no.1 / pp.62-67 / 2014
  • Cloud storage systems are usually built on a DFS (Distributed File System) for scalability and reliability. DFSs are designed to favor throughput over I/O response time and are therefore well suited to batch processing jobs. Recently, cloud storage systems have also been used for update-intensive applications such as OLTP. However, DFSs do not carefully consider in-place update operations, so their I/O performance degrades significantly when updates are frequent. DFSs with RAID techniques have been proposed to improve performance and reliability, but their performance degradation under frequent update operations can be even more significant. In this paper, we propose an in-place update method for DFS RAID that exploits a differential logging technique. The proposed method reduces the I/O costs, network traffic, and XOR operation costs for RAID. We demonstrate the efficiency of our proposed in-place update method through various experiments.
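
The abstract gives no details of the differential logging format, so the sketch below only illustrates the standard XOR parity identity that in-place updates of this kind can build on: when a data block changes from D to D', the new parity is P' = P XOR D XOR D', which means only the difference D XOR D' has to be logged or sent to the parity node. The function names are hypothetical, and this is not presented as the authors' method.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks):
    """Full parity computation: XOR of every data block in the stripe."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_bytes(parity, block)
    return parity


def differential_update(old_data: bytes, new_data: bytes, old_parity: bytes):
    """Update parity from the difference alone, without reading other blocks."""
    delta = xor_bytes(old_data, new_data)      # the differential record
    new_parity = xor_bytes(old_parity, delta)  # P' = P xor (D xor D')
    return delta, new_parity


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = make_parity(stripe)
delta, updated_parity = differential_update(stripe[1], b"\xaa\xbb", parity)
```

The updated parity equals what a full recomputation over the modified stripe would give, but the update touches only the changed block, its difference, and the old parity.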

A Study on dynamic gateway system for MOST GATEWAY Scheduling Algorithm (MOST GATEWAY 스케줄링 알고리즘에 관한 연구)

  • Jang, Seong-Jin;Jang, Jong-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.289-293 / 2011
  • In our previous research, we proposed a MOST GATEWAY system that organically connects MOST150 and MOST25 networks, together with a simulation design method for analyzing the performance of scheduling algorithms in the MOST GATEWAY system. In this paper, we use the NS-2 simulator to compare and analyze the performance of existing scheduling algorithms in MOST25 and MOST150 networks. Finally, we present a scheme for improving efficiency and scalability.
