Search | Korea Science

Design of the new parallel processing architecture for commercial applications (상용 응용을 위한 병렬처리 구조 설계)

한우종;윤석한;임기욱
- Journal of the Korean Institute of Telematics and Electronics B, v.33B no.5, pp.41-51, 1996
This paper proposes a new parallel processing system based on a cluster architecture that provides the scalability of a parallel processing system while maintaining the characteristics of a shared memory multiprocessor. Low-cost, high-performance microprocessors have recently led to the construction of large-scale parallel processing systems. Such systems provide large scalability but are mainly used for scientific applications with large data parallelism. A shared memory multiprocessor system such as TICOM is currently used as a server for commercial applications; however, shared memory multiprocessor systems are known to have very limited scalability. The proposed architecture supports the scalability and performance of a parallel processing system while remaining adaptable to commercial applications, and can therefore overcome the limitations of the shared memory multiprocessor. The architecture and characteristics of the proposed system are described. A proprietary hierarchical crossbar network is designed for this system, with its protocol, routing and switching technique, and signal transfer technique optimized for the proposed architecture. The design trade-offs for the network are described, and simulation with SES/workbench is used to examine how well the network fits the proposed architecture.

A Remote Cache Coherence Protocol for Single Shared Memory in Multiprocessor System (단일 공유 메모리를 가지는 다중 프로세서 시스템의 원격 캐시 일관성 유지 프로토콜)

Kim, Seong-Woon;Kim, Bo-Gwan
- Journal of the Institute of Electronics Engineers of Korea CI, v.42 no.6, pp.19-28, 2005
A multiprocessor architecture is an effective way to improve computer system performance. CC-NUMA, which provides a single shared memory space over physically distributed memories, is widely used in multiprocessor computer systems. A CC-NUMA system has a full-mapped directory for the shared memory and uses a remote cache memory for fast memory access. In this paper, we propose a processing node architecture for a CC-NUMA system and a cache coherency protocol for the physically distributed but logically shared system. We present implementation results for a system that adopts the cache coherency protocol.
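As a rough illustration of the full-mapped directory mentioned in the abstract above, the following Python sketch keeps one presence bit per node for a single memory block. The class name `DirectoryEntry`, its fields, and the simplified read/write handling are assumptions made for illustration; this is not the protocol proposed in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DirectoryEntry:
    """Hypothetical full-map directory entry for one memory block."""
    n_nodes: int
    presence: int = 0    # bit i is set when node i holds a copy of the block
    dirty: bool = False  # True when exactly one node holds a modified copy

    def sharers(self) -> List[int]:
        """Return the nodes whose presence bit is set."""
        return [i for i in range(self.n_nodes) if (self.presence >> i) & 1]

    def read(self, node: int) -> None:
        """Record a read: any dirty owner is assumed to write back first."""
        self.dirty = False
        self.presence |= 1 << node

    def write(self, node: int) -> List[int]:
        """Record a write and return the nodes that must be invalidated."""
        to_invalidate = [i for i in self.sharers() if i != node]
        self.presence = 1 << node
        self.dirty = True
        return to_invalidate


entry = DirectoryEntry(n_nodes=4)
entry.read(0)
entry.read(2)
invalidated = entry.write(1)  # nodes 0 and 2 lose their copies
```

The bit vector grows with the node count, which is the usual storage cost of a full-map scheme.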

Design and Performance Analysis of a Multiprocessor Computer System with Shared Memory (공유 메모리를 갖는 다중 프로세서 컴퓨터 시스팀의 설계 및 성능분석)

Choe, Chang-Yeol;Park, Byeong-Gwan;Park, Seong-Gyu;O, Gil-Rok
- ETRI Journal, v.10 no.3, pp.83-91, 1988
This paper describes the architecture and performance analysis of a multiprocessor system based on shared memory and a single system bus. The system bus provides a pended protocol for the multiprocessor environment. We use a simulation model to analyze processor utilization, address/data bus utilization, and memory conflicts. The hit ratio of the private cache memory is a major factor in whether the performance of a shared memory multiprocessor system increases linearly.
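The role of the cache hit ratio can be illustrated with a back-of-the-envelope bus model. This sketch is not the paper's simulation model; the function names and the parameters `refs_per_cycle` and `bus_cycles_per_miss` are hypothetical, and it assumes that every cache miss occupies the single shared bus for a fixed number of cycles.

```python
import math


def bus_demand(n_procs: int, hit_ratio: float,
               refs_per_cycle: float, bus_cycles_per_miss: float) -> float:
    """Fraction of bus cycles requested by all processors (may exceed 1.0)."""
    miss_ratio = 1.0 - hit_ratio
    return n_procs * refs_per_cycle * miss_ratio * bus_cycles_per_miss


def saturation_point(hit_ratio: float, refs_per_cycle: float,
                     bus_cycles_per_miss: float) -> float:
    """Processor count at which the requested bus cycles reach 100%."""
    per_proc = bus_demand(1, hit_ratio, refs_per_cycle, bus_cycles_per_miss)
    return math.inf if per_proc == 0 else 1.0 / per_proc


# Assumed workload: 0.5 memory references per cycle, 4 bus cycles per miss.
low = saturation_point(0.90, 0.5, 4)
high = saturation_point(0.95, 0.5, 4)
```

Under these assumptions, raising the hit ratio from 0.90 to 0.95 halves the miss ratio and so roughly doubles the number of processors the bus can serve before it saturates.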

A Design of Pipelined Memory Access Control for Multiprocessor Systems and its Evaluation (다중프로세서시스템에 대한 파이프라인 방식 메모리 접근제어의 설계와 그 효율분석)

김정두;손윤구
- Journal of the Korean Institute of Telematics and Electronics, v.25 no.8, pp.927-936, 1988
This paper proposes a pipelined memory access method as a new technique for the bus interface between processors and memories in tightly coupled multiprocessor systems. Since the shared bus is the bottleneck of the system, a model of pipelined access to memory has been developed. Evaluation with a discrete-time Markov model showed a significant improvement in efficiency.

Performance Evaluation for a Multiprocessor Computer System Using a Commercial Workload (상용 작업부하를 이용한 다중프로세서 컴퓨터 시스템 성능 평가)

박진원
- Journal of the Korea Society for Simulation, v.8 no.1, pp.35-49, 1999
The CC-NUMA based distributed shared memory is an emerging architecture for multiprocessor computer systems because of its scalability and ease of programming. In this paper, we analyze the performance of a ring-based CC-NUMA multiprocessor computer system using a commercial workload targeted at popular OLTP applications. The characteristics of the commercial workload were obtained from traces collected on real machines. The simulation results showed that the bottleneck on the ring could be effectively removed by using a dual ring structure. We believe our simulation methodology and results will help in designing better multiprocessor computer systems for commercial application domains.

A Dual Slotted Ring Organization for Reducing Memory Access Latency in Distributed Shared Memory System (분산 공유 메모리 시스템에서 메모리 접근지연을 줄이기 위한 이중 슬롯링 구조)

Min, Jun-Sik;Chang, Tae-Mu
- The KIPS Transactions:PartA, v.8A no.4, pp.419-428, 2001
Advances in circuit and integration technology are continuously boosting the speed of processors. One of the main challenges presented by such developments is the effective use of powerful processors in shared memory multiprocessor systems. We believe that the interconnection problem is not solved even for small-scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new, powerful processors. In the past few years, point-to-point unidirectional connections have emerged as a very promising interconnection technology. The single slotted ring is the simplest form of point-to-point interconnection. Its main limitation is that access latency increases linearly with the number of processors in the ring. For this reason, we propose the dual slotted ring as an alternative to the single slotted ring for cache-based multiprocessor systems. In this paper, we analyze the proposed dual slotted ring architecture using a new snooping protocol and perform simulations to compare it with the single slotted ring.
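The linear growth of latency on a single ring can be made concrete with a small hop-count calculation. The sketch below is illustrative only: the function names are hypothetical, and the dual ring is assumed to consist of two counter-rotating rings on which a message takes the shorter direction, which may differ from the organization proposed in the paper.

```python
def mean_hops_single(n: int) -> float:
    """Mean hops to a uniformly chosen other node on a unidirectional ring."""
    return sum(range(1, n)) / (n - 1)


def mean_hops_dual(n: int) -> float:
    """Mean hops when the shorter of two opposite directions is taken
    (assumes two counter-rotating rings)."""
    return sum(min(d, n - d) for d in range(1, n)) / (n - 1)


for nodes in (4, 8, 16, 32):
    print(nodes, mean_hops_single(nodes), round(mean_hops_dual(nodes), 2))
```

For 8 nodes the single ring averages 4 hops (n/2 in general), while the assumed dual ring averages about 2.29.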

Two-Level Multi-Scan Scheduler Using Resource Partition Strategy by Loose Processor-Affinity

Sohn, Jong-Moon;Kim, Gil-Yong
- Journal of Electrical Engineering and Information Science, v.2 no.3, pp.105-112, 1997
The performance of a shared memory multiprocessor system is very sensitive to process scheduling. We can enhance the performance of the whole system as well as of an individual process by taking the multiprocessor characteristics into account in the design of the process scheduler. In this paper, we propose a general-purpose scheduler for a shared memory multiprocessor, called the Two-Level Multi-Scan (TLMS) process scheduler, that considers processor affinity loosely and greatly decreases the interference among multiple processors. The TLMS scheduler is composed of a local scheduler at each processor and a semi-global scheduler that balances the load among processors. In particular, the semi-global scheduler tries to minimize priority inversion, which is an important factor in system performance. The TLMS scheduler also tries to reduce the number of resources to be shared and improves processor utilization. To meet these requirements, the semi-global scheduler interacts with the operation of the local scheduler when the need arises, hence the name loose processor-affinity. We also show that the proposed scheduling technique can be extended to other types of resources, making it a general-purpose resource management queue.

Analysis of the Influence of the Conflict Management Policy of the Transactional Memory on the System Performance and Bus Traffic (시스템 성능 및 버스 트래픽에 대한 트랜잭셔널 메모리의 충돌 관리 정책 영향 분석)

Kim, Young-Kyu;Moon, Byungin
- The Journal of Korean Institute of Communications and Information Sciences, v.37B no.11, pp.1041-1049, 2012
Transactional memory was proposed to solve the problems of conventional lock-based synchronization methods in shared memory multiprocessor systems. Various implementation methods for putting high-performance transactional memory to practical use have been continuously studied. However, these studies focus only on the commercialization and performance enhancement of transactional memory, and few have analyzed the system overhead of transactional memory according to the conflict management policy. This paper therefore classifies hardware transactional memory, which is one kind of transactional memory, into four types according to the conflict management policy, and then compares and analyzes their performance and system bus traffic through modeling and simulation. In addition, the most effective conflict management policy for hardware transactional memory is presented based on this comparison and analysis.
https://doi.org/10.7840/kics.2012.37B.11.1041

Multi-Programmed Simulation of a Shared Memory Multiprocessor System (공유메모리 다중프로세서 시스템의 다중 프로그래밍 모의실험 기법)

최효진;전주식
- Journal of KIISE:Computer Systems and Theory, v.30 no.3_4, pp.194-204, 2003
The performance of a shared memory multiprocessor system depends on system software, such as the scheduling policy, as well as on the hardware. Most existing simulators, however, do not support simulation of a multi-programmed environment because they can execute only a single benchmark application at a time. We propose a multi-programmed simulation method on a program-driven simulator, which enables the concurrent execution of multiple parallel workloads contending for limited system resources. Using the proposed method, system developers can measure and analyze the detailed effects of resource conflicts among concurrent applications as well as the effects of scheduling policies on a program-driven simulator. As a result, the proposed multi-programmed simulation provides a more accurate and realistic performance projection for designing a multiprocessor system.

Cache Coherence Protocols in NUMA Multiprocessors (NUMA 다중 프로세서에서의 캐쉬 일관성 프로토콜)

Moh, Sang-Man;Hahn, Woo-Jong;Yoon, Suk-Han
- Electronics and Telecommunications Trends, v.13 no.5 s.53, pp.11-22, 1998
Scalable multiprocessor systems based on the distributed shared memory (DSM) architecture have recently been actively developed for general-purpose computing to improve both programmability and scalability. In this paper, we survey and analyze cache coherence protocols in non-uniform memory access (NUMA) multiprocessor systems. In particular, we infer that specialized hardware suited to NUMA multiprocessor systems built from commodity symmetric multiprocessors (SMPs) is strongly needed. A cache coherence protocol combined with specialized hardware can significantly improve the performance and scalability of NUMA multiprocessor systems while providing better programmability.
https://doi.org/10.22648/ETRI.1998.J.130502
