Search | Korea Science

Design and Performance Analysis of a Multiprocessor Computer System with Shared Memory (공유 메모리를 갖는 다중 프로세서 컴퓨터 시스팀의 설계 및 성능분석)

Choe, Chang-Yeol;Park, Byeong-Gwan;Park, Seong-Gyu;O, Gil-Rok
- ETRI Journal / v.10 no.3 / pp.83-91 / 1988
This paper describes the architecture and performance analysis of a multiprocessor system based on shared memory and a single system bus. The system bus provides a pended protocol for the multiprocessor environment. We use a simulation model to analyze processor utilization, address/data bus utilization, and memory conflicts. The hit ratio of the private cache memory is a major factor in achieving a linear increase in the performance of a shared-memory multiprocessor system.

The development of the high effective and stoppageless file system for high performance computing (High Performance Computing 환경을 위한 고성능, 무정지 파일시스템 구현)

Park, Yeong-Bae;Choe, Seung-Hwan;Lee, Sang-Ho;Kim, Gyeong-Su;Gong, Yong-Jun
- Proceedings of the Korea Contents Association Conference / 2004.11a / pp.395-401 / 2004
In today's highly network-centric computing and enterprise environments, it is becoming essential to transmit data reliably at very high rates. Until now, file systems based on the client/server model, such as NFS (Network File System) and AFS (Andrew File System), have met various demands, but they can no longer satisfy those of today's scalable high-performance computing environments. Not only performance but also the redundancy of data sharing services has become a serious problem. In the case of NFS, locking and cache issues cause the file system to reboot and create problems when it is used simply with IP takeover for an H/A service. In the case of AFS, file sharing redundancy is provided, but only once storage and equipment that support redundancy are in place. Lustre is an open-source cluster file system developed to meet both demands. Lustre consists of three types of subsystems: the MDS (Meta-Data Server), which offers metadata services; OSTs (Object Storage Targets), which provide file I/O; and Lustre clients, which interact with the OSTs and the MDS. These subsystems exchange messages to provide a scalable high-performance file system service. In this paper, we compare the transmission speed of gigabyte-sized files between Lustre and NFS for varying numbers of concurrent users, and we also demonstrate the high availability of the file system by removing more than one OST during operation.

Distributed In-Memory Caching Method for ML Workload in Kubernetes (쿠버네티스에서 ML 워크로드를 위한 분산 인-메모리 캐싱 방법)

Dong-Hyeon Youn;Seokil Song
- Journal of Platform Technology / v.11 no.4 / pp.71-79 / 2023
In this paper, we analyze the characteristics of machine learning workloads and, based on them, propose a distributed in-memory caching technique to improve their performance. The core of a machine learning workload is model training, which is a computationally intensive task. Running machine learning workloads in a Kubernetes-based cloud environment in which the computing framework and storage are separated allows resources to be allocated effectively, but delays can occur because I/O must be performed over the network. In particular, we propose a new method of precaching the data required by machine learning workloads into the distributed in-memory cache by taking into account Kubeflow Pipelines, a Kubernetes-based machine learning pipeline management tool.

Node Incentive Mechanism in Selfish Opportunistic Network

WANG, Hao-tian;Chen, Zhi-gang;WU, Jia;WANG, Lei-lei
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1481-1501 / 2019
In an opportunistic network, the behavior of a node is autonomous and has social attributes such as selfishness. If a node wants to forward information to another node, it is bound to be limited by its own resources, such as cache, power, and energy. Therefore, during communication, some nodes do not help forward the information of other nodes because of their selfish behavior. This makes cooperation impossible to complete, greatly reduces the success rate of message transmission, increases network delay, and degrades overall network performance. This article proposes a hybrid incentive mechanism (Mim) based on the reputation mechanism and the credit mechanism. The selfishness model, the energy model (energy in this article exists in the form of electricity), and the transaction model constitute Mim. Mim classifies the selfishness of nodes, constantly tracks changes in node energy, and manages the wealth of both sides of a node transaction through a Central Money Management Center. By calculating the selfishness of a node, the currency trading model applies differentiated pricing to the node's services. Simulation results show that Mim improves the information delivery rate in the network and the fairness of node transactions. At the same time, it greatly increases the average lifetime of the network.
https://doi.org/10.3837/tiis.2019.03.021

A Block Replacement Scheme using Analytic Hierarchy Process in Hybrid HDD (하이브리드 하드디스크에서 AHP를 적용한 블록 교체 기법)

Kim, Jeong-Won
- Journal of Korea Society of Industrial Information Systems / v.20 no.5 / pp.45-52 / 2015
The read performance of a hybrid hard disk is better than that of a legacy hard disk, and its power consumption is also considerably low. Because blocks with sufficient locality may be placed in the non-volatile cache, whose size is generally limited, an effective block replacement scheme is required. Because this replacement is inevitably affected by various parameters, we define the issue as a multiple-criteria decision model. To solve this problem, this paper proposes a new block replacement algorithm based on the analytic hierarchy process. Through simulation of our model, we confirmed that the proposed model can be used as a replacement algorithm for hybrid hard disks, as it may improve boot time as well as the response time of general applications.
https://doi.org/10.9723/jksiis.2015.20.5.045

Technique for Estimating the Number of Active Flows in High-Speed Networks

Yi, Sung-Won;Deng, Xidong;Kesidis, George;Das, Chita R.
- ETRI Journal / v.30 no.2 / pp.194-204 / 2008
The online collection of coarse-grained traffic information, such as the total number of flows, is gaining in importance due to a wide range of applications, such as congestion control and network security. In this paper, we focus on an active queue management scheme called SRED since it estimates the number of active flows and uses the quantity to indicate the level of congestion. However, SRED has several limitations, such as instability in estimating the number of active flows and underestimation of active flows in the presence of non-responsive traffic. We present a Markov model to examine the capability of SRED in estimating the number of flows. We show how the SRED cache hit rate can be used to quantify the number of active flows. We then propose a modified SRED scheme, called hash-based two-level caching (HaTCh), which uses hashing and a two-level caching mechanism to accurately estimate the number of active flows under various workloads. Simulation results indicate that the proposed scheme provides a more accurate estimation of the number of active flows than SRED, stabilizes the estimation with respect to workload fluctuations, and prevents performance degradation by efficiently isolating non-responsive flows.

Design and Performance Analysis of High Performance Processor-Memory Integrated Architectures (고성능 프로세서-메모리 혼합 구조의 설계 및 성능 분석)

Kim, Young-Sik;Kim, Shin-Dug;Han, Tack-Don
- The Transactions of the Korea Information Processing Society / v.5 no.10 / pp.2686-2703 / 1998
The widening performance gap between processor and memory has led to the emergence of a promising architecture, processor-memory (P-M) integration. In this paper, various design issues for P-M integration are studied. First, an analytical model of the DRAM access time is constructed considering both the bank conflict ratio and the DRAM page hit ratio. The proposed model is then used to identify the points of performance improvement and the performance bottlenecks in the design of on-chip DRAM architectures. This paper proposes a new architecture, called the delayed precharge bank architecture, which improves the performance of the memory system by increasing the DRAM page hit ratio. This paper also adapts an efficient bank interleaving mechanism to the proposed architecture. Execution-driven simulation verifies that this architecture performs better than the hierarchical multi-bank architecture as well as the conventional bank architecture. Eight SPEC95 benchmarks are used for simulation while varying the parameters of the cache architecture, the number of DRAM banks, and the delayed time quantum.

Social-Aware Collaborative Caching Based on User Preferences for D2D Content Sharing

Zhang, Can;Wu, Dan;Ao, Liang;Wang, Meng;Cai, Yueming
- KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.3 / pp.1065-1085 / 2020
With the rapid growth of content demands, device-to-device (D2D) content sharing is exploited to effectively improve the service quality of users. Considering the limited storage space and various content demands of users, caching schemes are significant. However, most of them ignore the influence of asynchronous content reuse and the selfishness of users. In this work, user preferences are defined by exploiting the user-oriented content popularity and the current caching situation. We further propose the social-aware rate, which comprehensively reflects the achievable content download rate as affected by social ties, caching indicators, and user preferences. Guided by this, we model the collaborative caching problem as a trade-off between the redundancy of cached contents and the cache hit ratio, with the goal of maximizing the sum of social-aware rates under the constraint of limited storage space. Because the problem is intractable, it is reduced to the maximization of a monotone submodular function subject to a matroid constraint. Subsequently, two social-aware collaborative caching algorithms are designed by leveraging the standard and continuous greedy algorithms, respectively, which are proved to achieve different approximation ratios in unequal polynomial time. We present simulation results to illustrate the performance of our schemes.
https://doi.org/10.3837/tiis.2020.03.009

The Reliable Multicast Transport Protocol over Wireless Convergence Networks using a Retransmission Agent (재전송 Agent를 이용한 유무선 융합망에서의 신뢰성 있는 멀티캐스트 전송 방식)

Youm, Sungkwan;Yu, Sunjin
- Journal of the Korea Convergence Society / v.7 no.4 / pp.25-32 / 2016
When a reliable multicast protocol is used over air links, multicast packets lost on the air link trigger retransmission request packets and an implosion of retransmission packets, which degrade multicast session performance. This paper proposes an efficient reliable multicast mechanism for wireless networks that uses agents. We present the design of a retransmission agent that improves the performance of reliable multicast sessions in wireless networks. The main idea is to cache reliable multicast packets at the base station and perform local retransmissions across the wireless link. MATLAB was used to simulate and obtain performance results for signaling overhead and processing delay by comparing the proposed agent model with the Multicast File Transfer Protocol. The simulation results show that the proxy module makes pass trials shorter in the Multicast File Transfer Protocol.
https://doi.org/10.15207/JKCS.2016.7.4.025

Performance Analysis of the Parallel CUPID Code for Various Parallel Programming Models in Symmetric Multi-Processing System (Symmetric Multi-Processing 시스템에서 다양한 병렬 기법 모델을 적용한 병렬 CUPID 코드의 성능분석)

Jeon, Byoung Jin;Lee, Jae Ryong;Yoon, Han Young;Choi, Hyoung Gwon
- Transactions of the Korean Society of Mechanical Engineers B / v.38 no.1 / pp.71-79 / 2014
A parallelization of the bi-conjugate gradient solver for the pressure equation of the CUPID (component unstructured program for interfacial dynamics) code, which was developed for analyzing the components of a pressurized water-cooled reactor, was studied in a symmetric multi-processing system. The parallel performance was investigated for three typical parallel programming models (MPI, OpenMP, Hybrid) by solving incompressible backward-facing step flow at various grid resolutions. It was confirmed that parallel performance was low when problem size was small or the memory requirement for each thread was considerably higher than the cache memory. Furthermore, it was shown that MPI was better than OpenMP regardless of the problem size, and Hybrid was the best when the number of threads was relatively small.
https://doi.org/10.3795/KSME-B.2014.38.1.071
